Analytics. For Anyone. Be Heroic Turn Data into Action




Progressive businesses must accelerate time-to-value not only to thrive, but to survive. Analytics on big data is no longer just a competitive advantage. It's a business requirement.

Built by data scientists for data scientists, business analysts, and developers. RapidMiner is the industry's easiest-to-use Modern Analytics Platform that significantly accelerates productivity from data prep to predictive action. Unlike traditional analytics providers, RapidMiner enables anyone to make the most of all data in all environments, creating a powerful advantage from the wisdom of over 250,000 users.

The Analytics Spectrum
The spectrum runs from Business Intelligence to Advanced Analytics.
SQL Analytics: Count, Mean
Descriptive Statistics: Univariate distribution, Central tendency, Dispersion
Data Mining: Association rules, Clustering, Feature extraction
Predictive Analytics: Classification, Regression, Time series, Text, Spatial, Machine learning
Simulation: Monte Carlo, Agent based modeling, Discrete event modeling
Optimization: Linear optimization, Non-linear optimization

Key Trends & Drivers in Modern Analytics
Market Forces: Internet of Things, Consumerization, Mass Personalization
Modern Business: Accelerate time-to-value, Maximize business value, Simplify getting to value
Technology: Big Data, New compute engines, Cloud

Evolving Advanced Analytics Market
Traditional (Limitations): Limited handling of variety of data sources; Legacy compute engines; On-premises, if not offline
Modern (Limitless): Big Data; New compute engines; Cloud

Advanced Analytics Market Maturity
Traditional: Lagging innovation
Modern: High-velocity innovation

Traditional vs. Modern Analytics Market
[Figure: Magic Quadrant for Advanced Analytics Platforms, February 2014. Axes: Completeness of vision, Ability to execute. Quadrants: Challengers, Leaders, Niche players, Visionaries. Vendors shown: StatSoft, SAS, RapidMiner, IBM, Angoss, SAP, Knime, Oracle, Megaputer, FICO, Revolution Analytics, Microsoft, InfoCentricity, Alpine Data Labs, Alteryx, Actuate.]

Skills Gap in the Modern Analytics Market
Searching for Unicorns
McKinsey projects by 2018 there will be a shortage of 1.7M professionals with analytics expertise in the U.S. alone.
[Diagram: Computer Science + Domain Expertise + Math = Data Scientist. Also labeled: Statistician, Actuarial, Quant. Caption: Data Science Skills Gap in 2018 (McKinsey 2012).]

Unlocking Value with Modern Analytics
Business Analysts: Next Generation Data Scientist (aka: the hero you are looking for)
[Diagram: Computer Science + Domain Expertise + Math = Data Scientist. Also labeled: Statistician, Actuarial, Quant.]

Enter RapidMiner. Analytics. For Anyone.
Accelerate: Pre-Built Models, One-Click Deployments
Connect: All Data, All Environments
Simplify: Code Free, Wisdom of Crowds

Wisdom of Crowds
How do we create data science heroes?
1. Anonymously collect analytic models from analysts across the enterprise
2. Store them in a knowledge base of analytic best practices
3. Use machine learning algorithms to recommend and empower any user at any skill level to become a data science hero
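The collect, store and recommend steps above can be illustrated with a very small sketch. It is not RapidMiner's implementation: the slide mentions machine learning algorithms, while this example substitutes plain frequency counting of which operator usually follows which. The function names and the sample processes are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def build_knowledge_base(processes):
    """Steps 1 and 2: store, for each operator, how often each other operator follows it.

    `processes` is a list of operator-name sequences collected without user identity.
    """
    kb = defaultdict(Counter)
    for ops in processes:
        for current, nxt in zip(ops, ops[1:]):
            kb[current][nxt] += 1
    return kb


def recommend(kb, current, k=2):
    """Step 3 (simplified): suggest the k operators that most often follow `current`."""
    return [op for op, _ in kb.get(current, Counter()).most_common(k)]


# Hypothetical sample of anonymized processes.
collected = [
    ["Read CSV", "Replace Missing Values", "Decision Tree", "Apply Model"],
    ["Read CSV", "Replace Missing Values", "Naive Bayes", "Apply Model"],
    ["Read CSV", "Normalize", "Naive Bayes", "Apply Model"],
    ["Read Database", "Replace Missing Values", "Naive Bayes"],
]
kb = build_knowledge_base(collected)
print(recommend(kb, "Read CSV"))
print(recommend(kb, "Replace Missing Values", k=1))
```

A production recommender would likely use richer context than the single preceding operator, but the counting version shows why more collected processes lead to better suggestions.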

RapidMiner Modern Analytics Platform
RapidMiner Studio: Code free design of your analytics using 1500+ operators
RapidMiner Radoop: Push down computations to where your data lives
RapidMiner Streams: Analyze streaming data while in motion
RapidMiner Cloud: Elastic compute environment for high performance analytics
RapidMiner Server: Enterprise analytics environment for integration with business processes
[Diagram labels: stages Design, Orchestrate, Consume; audiences Business Analysts (Code Free GUI), Data Scientists, Business Users; consumption through Web Services API into Web App, Biz App, BI, Machine, Custom App, Viz; engines in Studio, Server, Radoop, Streams and Cloud; compute In-Memory, In-Hadoop, In-Database, In-Stream.]

RapidMiner Radoop Architecture
Components: Studio, Server, Radoop
Visual development: Code free design in RapidMiner with 70+ operators, covering Data Integration, Data Discovery, Data Prep, Model Building, Model Validation, Model Scoring
Programming code (Hadoop environment): Hive (SQL), Pig (Scripting), MapReduce, Mahout (Machine Learning), HDFS, Impala (In-memory SQL), YARN, Spark (MLlib)
One-click push down to Hadoop environment
Optimized distributed execution in Hadoop environment

RapidMiner Streams Architecture
Components: Studio, Server, Streams, Streams Engine
Visual development: Code free design in RapidMiner leveraging 1500+ operators, covering Data Integration, Data Prep, Model Scoring
[Diagram labels: Application (push); Message broker (Apache Kafka or Amazon SQS); pull; deploy process as topology; Apache Storm cluster running a Storm topology of Spout and Bolt nodes; monitor and manage; store (Cassandra, MongoDB, Redis, SOLR); Application (pull).]
One-click push down to Storm environment
Distributed execution in Storm environment
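The spout and bolt flow named on this slide can be pictured with a plain-Python analogy. This is only a sketch under stated assumptions: it does not use the Apache Storm API, the in-memory `deque` stands in for the message broker, the dictionary stands in for the store, and the threshold rule stands in for a trained scoring model. All names and field values are hypothetical.

```python
from collections import deque


def spout(queue):
    """Pull messages from the (stand-in) broker queue until it is empty."""
    while queue:
        yield queue.popleft()


def prep_bolt(events):
    """Data prep: drop events with a missing value and cast the rest to float."""
    for event in events:
        if event.get("minutes_watched") is not None:
            yield {**event, "minutes_watched": float(event["minutes_watched"])}


def scoring_bolt(events, threshold=30.0):
    """Model scoring: a fixed threshold rule standing in for a trained model."""
    for event in events:
        yield {**event, "engaged": event["minutes_watched"] >= threshold}


def run_topology(queue, store):
    """Chain spout -> prep bolt -> scoring bolt and write each result to the store."""
    for event in scoring_bolt(prep_bolt(spout(queue))):
        store[event["viewer"]] = event["engaged"]
    return store


# Hypothetical messages pushed by an application.
broker = deque([
    {"viewer": "v1", "minutes_watched": "45"},
    {"viewer": "v2", "minutes_watched": None},
    {"viewer": "v3", "minutes_watched": 12},
])
results = run_topology(broker, {})
print(results)
```

Because each stage is a generator, events flow through one at a time, which mirrors the idea of analyzing data while it is in motion rather than after it has been stored.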

RapidMiner Server Architecture
Server components: Web Services API, App Designer, Shared Repository, User Management, Scheduler, Engine
Web Services API: Integrate analytic models into any type of application through a RESTful API (Web App, Machine, Biz App, BI, Custom App, Viz)
App Designer: Build web based predictive reports; Create ad hoc reports; Build predictive apps without coding; Embed app into business process
Shared Repository: Collaborate on analytic processes; Share data, processes and models
User Management: Create and manage users, roles and access rights
Scheduler: Execute analytics at certain times or automate repeating execution patterns
Engine: High-performance compute engine for distributed and remote work

RapidMiner Modern Analytics Flow
Data Sources: Work with any data, from any source
Compute Engines: Work in any environment, at any time
Model Building & Scoring: Data Integration, Discovery & Preparation; Model Building, Validation & Scoring
Model Deployment: Deploy models any way you want
Model Consumption: Embed your insights and take action

One Platform To Rule Them All
Model Building and Scoring (1500+ operators, 200+ community contributed operators)
Data Integration:
  50+ data connectors with access to 100+ sources including 40+ file types
  Any data type: Structured, Semi-structured, Unstructured, Binary
  700+ data parsing, data blending, data cleansing, transformations, aggregations, set operations, rotations, filtering, outlier detection, value type transformations, feature creation, window functions, feature extraction
  20+ process control structures
Data Discovery:
  25+ interactive data visualizations including: Data tables, Scatter matrices, Bubble charts, Parallel coordinates, Deviation plots, 3-D scatter plots, Density plots, Histograms, Survey plots, Andrews curve, Quartile, Pareto charts, Network & tree visualizations
  30+ image format exports
Data Preparation:
  45+ feature selection, automatic & manual
  20+ missing value replacement & imputation
  96+ feature creation, automatic & manual
  20+ anomaly & outlier detection
  60+ dimension reduction / feature selection
  20+ segmentation & clustering
  80+ processing & feature extraction from unstructured data
Model Building:
  25+ statistical
  250+ machine learning
  200+ association mining, frequent item set, similarity computation, feature weighting
  10+ ensemble and hierarchical models
  10+ model and parameter optimization
  Automatic model fitting
  Integration of 3rd party analytics, optimization solvers or simulation tools
Model Validation:
  10+ cross validation
  20+ visual evaluation
  30+ numerical / nominal / categorical model performance criteria
  10+ significance tests
  5+ optimal threshold cutoff for binomial classes
  5+ cluster performance measures
Model Scoring:
  Model scoring for all applicable model building
Add'l Analytics:
  50+ text analytics
  15+ web mining
  30+ image / audio / video mining
  85+ time series
  30+ financial & economics

Work with Any Data, from Any Source
Data Sources (access to 450+ data sources)
Flat Files
Text Files
Databases via JDBC or ODBC
Database & Cube Queries: MDX
Hadoop Sources
NoSQL database
Cloud Data Sources
Web Services: Web pages, Web services
Mail Services: POP3, IMAP
Logos & icons represent a partial list. Full list available upon request.

Work in Any Environment, at Any Time
Compute Engines: In-memory, In-SQL & In-database, In-cluster, In-Hadoop, In-Cloud, In-stream
Logos & icons represent a partial list. Full list available upon request.

Deploy Models Any Way You Want
Model Deployment
Model Scheduling: Scheduled model execution
Model Publishing: Publishing model results via web services API into Web services, Web application, Business application (ERP, CRM, Marketing Automation, etc.), Machines, Streaming application, Rule engines, Complex event processing, Business intelligence, Data visualization, Cloud application, Custom application
Model Embedding: Embedding of model via Java API into any application; callable from any application
Model Export: PMML export

Embed Your Insights and Take Immediate Action
Model Consumption: Business Intelligence, CRM, Marketing Automation, Cloud Applications (connecting to 300+ cloud services), ERP, Custom (custom web applications, web portals)
Logos & icons represent a partial list. Full list available upon request.

Get to Meaningful Business Value in a Snap
Accelerate: Drop development time from days to minutes
Connect: Automate data integration
Simplify: Make data science accessible to all

Design Your Analytics. Coding Not Required.
Supercharge your results with 1500+ analytic operators
Liberate your business analysts with a code free environment
Leverage the wisdom of over 250,000 users worldwide
Boost your data science knowledge with interactive help

Machine Learning on Hadoop
How do we become big data heroes?
Pushing data prep and machine learning into Hadoop clusters is complex and requires coding. Not a viable option!
Push computations into Hadoop clusters from a code free environment. Heroes use RapidMiner!

Use Case Example: Churn Prevention with Hadoop
Task: Separate loyal customers from customers who are likely to churn.
Solution with Hadoop + Mahout + (a lot of) custom coding
Timeline markers: DAY 1, DAY 3, DAY 12, DAY 18
1. Define a schema and create tables for customer data, past transactions, service usage log files, and so on. Manually list columns, types, defining separator characters, etc.
2. Write HiveQL queries (or Pig scripts or other code) to aggregate transactions and service logs for each customer and calculate attributes describing them
3. Implement and execute a custom MapReduce job to convert data to Mahout's input format
4. Run the Mahout Naïve Bayes algorithm with proper parameters from the command line
5. Repeat each step for the customers you want to apply the model on
6. Implement and execute a custom MapReduce job to convert predictions back into a delimited format
7. Export the result from HDFS
8. Import the result into an RDBMS
TIME: 3 WEEKS
Disconnected individuals get bogged down in endless process, coding and queries. In the meantime, your competition beats you to the punch.
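To make the analytic core of this task concrete, here is a minimal single-machine sketch of two of the ideas named above: aggregating transactions per customer into descriptive attributes, and classifying customers with Naïve Bayes. It is not Hive, Mahout or RapidMiner code. The thresholds, feature names, customer IDs and labels are all hypothetical, and the classifier is a small categorical Naïve Bayes with add-one smoothing written from scratch for illustration.

```python
import math
from collections import Counter, defaultdict


def aggregate(transactions):
    """Roll raw (customer_id, amount) rows up into one categorical feature row per customer."""
    totals = defaultdict(lambda: {"n": 0, "spend": 0.0})
    for customer, amount in transactions:
        totals[customer]["n"] += 1
        totals[customer]["spend"] += amount
    # Hypothetical cut-offs that turn counts and sums into categories.
    return {
        customer: ("many" if t["n"] >= 3 else "few",
                   "high" if t["spend"] >= 100 else "low")
        for customer, t in totals.items()
    }


def train_nb(rows, labels):
    """Count class frequencies and per-class feature value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (class, feature index) -> value counts
    values = defaultdict(set)              # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(label, i)][value] += 1
            values[i].add(value)
    return class_counts, feature_counts, values


def predict_nb(model, row):
    """Return the class with the highest log-probability, using add-one smoothing."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)
        for i, value in enumerate(row):
            smoothed = (feature_counts[(label, i)][value] + 1) / (count + len(values[i]))
            score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical transaction log and known outcomes.
transactions = [("a", 60), ("a", 50), ("a", 40), ("b", 10),
                ("c", 80), ("c", 70), ("c", 20), ("d", 5), ("d", 5)]
outcome = {"a": "loyal", "b": "churn", "c": "loyal", "d": "churn"}

features = aggregate(transactions)
customers = sorted(features)
model = train_nb([features[c] for c in customers], [outcome[c] for c in customers])
print(predict_nb(model, ("few", "low")))
print(predict_nb(model, ("many", "high")))
```

The sketch covers only the modeling logic. The schema definition, format conversion, export and import steps listed on the slide are the distributed plumbing around it.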

Use Case Example: Churn Prevention with Hadoop
Task: Separate loyal customers from customers who are likely to churn.
Solution with RapidMiner
1. Combine data from Hadoop and any traditional source
2. Train model in distributed Hadoop cluster
3. Apply model in RapidMiner and integrate seamlessly
TIME: 10 MINUTES
Your team designs the process in collaboration with each other just like they would on a white board. And then you press play. That's it.

RapidMiner Radoop consistently delivers performance increases of up to 4,000% compared to pure scripting approaches*
* RapidMiner results compared against traditional Hadoop approaches including data integration, data prep, modeling, deployment and maintenance.

RapidMiner Fills In The Skills Gap
Business Analysts: Next Generation Data Scientists (this is the realm of heroes)
[Diagram: Computer Science + Domain Expertise + Math = Data Scientist. Also labeled: Statistician, Actuarial, Quant.]

Companies Around The World Use RapidMiner
Industries: Technology; Pharma & Healthcare; Oil & Gas, Chemicals; Government & Defense; Consulting; Manufacturing; Aerospace; Consumer Products; Business Services; Software & Analytics; Financial Services; Entertainment; Academia; Retail

Signature Customers

Process Customer Feedback In Multiple Languages To Increase Retention Rates
Challenge: Applying basic voice-of-the-customer concepts and text analytics to customer feedback in over 60 countries worldwide.
Solution: Use RapidMiner's Platform to detect churn and identify customer service issues regardless of time, location or language.
Key figure: 150,000 customer comments and tweets in almost every language processed on RapidMiner
Data Science Hero Spotlight: "Business executives, who hold the power to allocate text analytics resources, are beginning to see and realize the benefits to help better focus and solve business problems." -- Han-Sheong Lai, Director of Operational Excellence & Customer Advocacy
Accelerate: Process massive amounts of text at high speed
Connect: Analyze multiple silos of global customer data
Simplify: Automatically determine intent-to-churn

Quickly Prototype Analytics Models for Under Armour Challenge Wearables Data
Challenge: Quickly prototype analytics processes for Under Armour wearable data, for the Under Armour39 Challenge.
Solution: Use RapidMiner's code free, drag and drop GUI to quickly design 11 analytics processes, iterate them for optimization, and win the challenge.
Key figure: 1.8M data points analyzed, per hour, by the Under Armour39 wearable
Data Science Hero Spotlight: "RapidMiner is extremely powerful, has the best operators, and can handle Big Data from wearables. It also allows us to rapidly prototype sophisticated analytics, machine learning and classification applications, saving time and money." -- Kevin Logan, CEO
Accelerate: Prototype multiple analytics processes quickly and easily
Connect: Analyze Big Data from wearable devices
Simplify: Use code free, drag and drop GUI for analytics

Track Data from Millions of Companies to Identify Critical Economic Drivers
Challenge: Monitor corporate performance data in real time, and identify correlations, outliers, and economic drivers.
Solution: Use RapidMiner's algorithms for rapid prototyping and visualizations for correlations, and to identify outlying, unusual data.
Key figure: 4.5M subject matter experts content analyzed in the United Kingdom
Data Science Hero Spotlight: "We benefit from the public availability of extensions and the RapidMiner Marketplace. We can easily search for what others have designed in RapidMiner, and use the extensions that are a fit for us." -- Tom Gatten, CEO
Accelerate: Prototype analytics and visualizations quickly
Connect: Analyze data from the digital footprint of UK businesses
Simplify: RapidMiner Marketplace public extensions

Search Millions of Patents Online and Automatically Mine Image Data
Challenge: Search millions of patents online and automatically mine image data for applicable information.
Solution: Use RapidMiner text and image mining to quickly and easily identify several thousand images of interest.
Key figure: 1M+ detailed patent records mined online, including images
Data Science Hero Spotlight: "Some years ago (the patent team) had tried a dedicated patent classification tool that didn't work - RapidMiner does. It provides a framework for substantially reducing the time it takes us to find interesting patents." -- Thomas Hartmann, Business Engineer
Accelerate: Automatically mine millions of online patent images
Connect: Search through a wide variety of online data sources
Simplify: No programming required to connect insights to action

Television Broadcasters Project
Drive Broadcast Revenues and Customer Retention with Streaming, Real-Time Analytics
Challenge: Better understand TV viewing habits to prevent churn and optimize advertising.
Solution: Process streaming Big Data from three million TV viewers, in real-time, to make program content recommendations and target advertising.
Key figure: <5s time to generate high value activities based on predictive analytics
Data Science Hero Spotlight: "RapidMiner allows us to leverage Big Data, in real-time, for the TV industry." -- Avi Bernstein, Professor at the University of Zurich, Department of Informatics
Accelerate: Personalized recommendations in less than five seconds
Connect: Stream and analyze from set-top boxes, mobile devices and PCs
Simplify: Code free design of streaming analytics

Don't Take It From Us
"RapidMiner was most frequently selected based on ease of use, license cost, and speed of model development/ability to build large numbers of models. A number of templates guide users on the most common set of predictive use cases. Customer references cite high levels of satisfaction with the data access, data filtering and manipulation, predictive analytics and further advanced analytics components of the product." -- Gareth Herschel, Research Director
"Radoop also makes an eponymous product, focused on Hadoop analytics functionality, that is also visually-oriented and is 'powered by' RapidMiner itself, making the union quite logical." -- Andrew Brust, Research Director
"RapidMiner is an excellent data mining and statistics platform with a large following. With version 6 the product and company became much more commercial, and the recent acquisition of Radoop puts it in the big data league." -- Martin Butler, Research Director

Recognized Leader in Advanced Analytics
"Customer references cite high levels of satisfaction with the data access, data filtering and manipulation, predictive analytics and further advanced analytics components of the product."
[Figure: Gartner Magic Quadrant for Advanced Analytics Platforms, as of February 2014. Axes: Completeness of vision, Ability to execute. Quadrants: Challengers, Leaders, Niche players, Visionaries. Vendors shown: StatSoft, Angoss, SAP, SAS, RapidMiner, Knime, IBM, Oracle, Megaputer, FICO, Revolution Analytics, Microsoft, InfoCentricity, Alpine Data Labs, Alteryx, Actuate.]
Source: Gartner Magic Quadrant for Advanced Analytics Platforms (Feb. 14). www.rapidminer.com/gartner2014

Our History
RapidMiner was born from a data science project at the University of Dortmund, Germany, by Ingo Mierswa, Ralf Klinkenberg and Simon Fischer. Initially known as YALE in 2001, the product led to Rapid-I, a company founded by Ingo and Ralf in 2007. Later, the company was renamed RapidMiner, and in 2012 its global headquarters were established in Cambridge, Massachusetts, USA.
Our Milestones
2007: Open Source (5,000 global users)
2010: Open Core (30,000 global users)
2013: Business Source (150,000 global users)
2014: Big Data & Cloud (250,000 global users)
Customers: 600+ worldwide
Corporate Locations: North America, EMEA
Industries: Manufacturing, Retail/CPG, Financial, Utilities/Energy, Government, Automotive, Life Science, Telecom
Investors: Earlybird Venture Capital, Open Ocean Capital

Activating the data science hero in every business analyst! www.rapidminer.com