SEIZE THE DATA. 2015




1 Copyright 2015 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

BIG DATA CONFERENCE 2015
Boston, August 10-13
Predicting and reducing deforestation using HP Distributed R and Vertica
Jorge Ahumada, Executive Director, TEAM Network, Conservation International
Sunil Venkayala, Senior Technical Product Manager, HP Big Data software
Aug 11th, 2015

Simplify operationalizing predictive analytics

Data science: Process flow
Understand problem
Explore data assets
Prepare data
Model and evaluate
Operationalize and monitor

Data scientist: Bridging the business and IT gap
Business: Achieve competitive advantage with predictive and prescriptive insights from data
Data scientist: Build and deploy actionable analytic solutions to meet business goals
IT: Ensure system architecture, data management and security

Challenges: Operationalizing predictive analytics

Challenges: Increase in data sizes/types for predictive analytics
Source: TDWI Research, Predictive Analytics for Business Advantage, 2014. Visit tdwi.org/bpreports for more information.

Data integration: Shared-nothing massively parallel processing columnar database
Volume: Bulk load to memory; bulk load to disk; 35 TB SLA at Facebook
Velocity: Trickle load; Kafka*; low latency in seconds
Variety: Flex Zone; IDOL CFS; structured, semi-structured and unstructured data

Data preparation: Vertica and vertica.dplyr*
Fast query performance with comprehensive SQL support
Data cleaning: Missing value handling; outlier handling; binning and normalization
Filtering and aggregations: Derive new features; filter irrelevant cases; join multiple data sets
Data management: User management; data security; high availability
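
The data cleaning steps named on this slide (missing value handling, outlier handling, binning and normalization) can be illustrated with a minimal sketch in plain Python. This is not Vertica SQL or vertica.dplyr code; the function names, the median fill, the clipping bounds and the sample values are all assumptions chosen for illustration.

```python
from statistics import median


def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def clip_outliers(values, lower, upper):
    """Clamp every value into the closed range [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bin(values, n_bins):
    """Assign each value to one of n_bins equal-width bins (0-based index)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid a zero width for constant input
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical sensor readings with one missing value and one outlier.
raw = [12.0, None, 15.0, 14.0, 400.0, 13.0]
filled = impute_missing(raw)
clipped = clip_outliers(filled, 0.0, 20.0)
scaled = min_max_normalize(clipped)
bins = equal_width_bin(clipped, 4)
```

In a database-backed workflow the same transformations would typically be expressed as SQL so that they run where the data lives; the sketch only shows the arithmetic behind each step.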

Build predictive models: HP Distributed R
Open-source scalable distributed computing platform native to R
Native to R: Distributed data structures; distributed computing API; open source with HP support
Out-of-box parallel algorithms: Classification (Random Forest, Logistic Regression); Regression (Linear Regression); Clustering (K-Means)
Vertica integration: Native parallel data connector; vertica.dplyr; HPData
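
The idea behind distributed data structures and parallel algorithms can be sketched as follows: the data is split into partitions, each partition computes small summary statistics independently, and the summaries are combined to fit the model. The example below fits a one-variable linear regression this way in plain Python. It is an illustration of the general pattern only, not the Distributed R API; the partition layout, function names and use of a thread pool are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sums(partition):
    """Compute the sufficient statistics for one partition of (x, y) pairs."""
    n = len(partition)
    sx = sum(x for x, _ in partition)
    sy = sum(y for _, y in partition)
    sxx = sum(x * x for x, _ in partition)
    sxy = sum(x * y for x, y in partition)
    return n, sx, sy, sxx, sxy


def fit_linear(partitions):
    """Fit y = slope * x + intercept from per-partition summaries."""
    # Each partition is summarized independently; only five numbers per
    # partition are sent back to be combined.
    with ThreadPoolExecutor() as pool:
        parts = list(pool.map(partial_sums, partitions))
    n, sx, sy, sxx, sxy = (sum(col) for col in zip(*parts))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Hypothetical data following y = 2x + 1, split across three partitions.
partitions = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0)],
    [(4.0, 9.0)],
]
slope, intercept = fit_linear(partitions)
```

Because only the summaries leave each partition, the amount of data moved between workers stays constant as the partitions grow, which is the property that lets this kind of algorithm scale out.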

Operationalize predictive models: Vertica and Distributed R
Deploy R models: Have predictive models near data; model management in highly available storage; model metadata management
In-database scoring: Out-of-box SQL predict functions; low memory footprint with high scalability; data security
BI/application integration: Visual predictive insights partner ecosystem; embedded analytics using JDBC/ODBC
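
In-database scoring means the model parameters are stored in the database and predictions are produced by a SQL function, so rows never leave the database to be scored. The sketch below imitates that pattern with Python's built-in sqlite3 module as a stand-in database. The table names, the model name `churn_v1`, the coefficients and the `logistic_predict` function are all hypothetical; they are not Vertica's predict functions.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")

# Model parameters are stored in a table next to the data (hypothetical values).
conn.execute(
    "CREATE TABLE models (name TEXT PRIMARY KEY, intercept REAL, coef REAL)"
)
conn.execute("INSERT INTO models VALUES ('churn_v1', -1.0, 0.5)")

conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, monthly_usage REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(1, 0.0), (2, 2.0), (3, 6.0)]
)


def logistic_predict(intercept, coef, x):
    """Score one row with a single-feature logistic regression model."""
    return 1.0 / (1.0 + math.exp(-(intercept + coef * x)))


# Register the scoring function so it can be called from SQL.
conn.create_function("logistic_predict", 3, logistic_predict)

rows = conn.execute(
    """
    SELECT c.id, logistic_predict(m.intercept, m.coef, c.monthly_usage) AS score
    FROM customers AS c
    JOIN models AS m ON m.name = 'churn_v1'
    ORDER BY c.id
    """
).fetchall()
```

A BI tool or application connected over JDBC/ODBC would only need to issue the `SELECT`; the scoring logic and the model stay inside the database.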

HP Haven Predictive Analytics
Delivering scale and performance with Distributed R breakthrough technology
1. Ingest and prepare data by leveraging Vertica
2. Build and evaluate predictive models on large data sets using Distributed R
3. Deploy models to Vertica and use in-database scoring to produce prediction results for BI and applications
Diagram labels: Build models; evaluate models; deploy models (in-database scoring); BI integration; HP powered clustered computing
A scalable, high-performance engine for the R language developed by HP Labs
Native integration with Vertica
Compatible with popular tools like RStudio and existing R libraries
Open source, with enterprise-class support from HP

Forward-looking statements
This is a rolling (up to three year) Roadmap and is subject to change without notice. This document contains forward-looking statements regarding future operations, product development, product capabilities and availability dates. This information is subject to substantial uncertainties and is subject to change at any time without prior notification. Statements contained in this document concerning these matters only reflect Hewlett-Packard's predictions and/or expectations as of the date of this document, and actual results and future plans of Hewlett-Packard may differ significantly as a result of, among other things, changes in product strategy resulting from technological, internal corporate, market and other changes. This is not a commitment to deliver any material, code or functionality and should not be relied upon in making purchasing decisions.

HP confidential information
This Roadmap contains HP Confidential Information. If you have a valid Confidential Disclosure Agreement with HP, disclosure of the Roadmap is subject to that CDA. If not, it is subject to the following terms: for a period of 3 years after the date of disclosure, you may use the Roadmap solely for the purpose of evaluating purchase decisions from HP and use a reasonable standard of care to prevent disclosures. You will not disclose the contents of the Roadmap to any third party unless it becomes publicly known, rightfully received by you from a third party without duty of confidentiality, or disclosed with HP's prior written approval.

This is a rolling (up to 3 year) roadmap and is subject to change without notice.
New distributed algorithms
Generalized boosting models: Ensemble modeling for robust prediction accuracy
Pattern mining (association rules): Cross-sell or up-sell based on buying patterns
Attribute/feature importance (random forest): Identify important attributes in high-dimensional data
Decision trees: Discover business rules from data
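
As an illustration of the pattern mining item, the sketch below derives simple "if a customer buys A, they also buy B" rules from shopping baskets using the standard support and confidence measures. It is a single-machine toy in plain Python, not the roadmap's distributed algorithm; the basket contents, thresholds and function name are assumptions.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support, min_confidence):
    """Return (lhs, rhs, support, confidence) rules for item pairs."""
    n = len(baskets)
    item_counts = Counter(item for b in baskets for item in set(b))
    pair_counts = Counter(
        pair for b in baskets for pair in combinations(sorted(set(b)), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # Share of baskets with lhs that also contain rhs.
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical purchase baskets.
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "jam"],
    ["bread", "milk"],
    ["butter", "jam"],
]
rules = pair_rules(baskets, min_support=0.5, min_confidence=0.75)
```

With these baskets the only rule that passes both thresholds is "jam implies butter", which is the kind of signal a cross-sell recommendation would be built on.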

Roadmap: Co-location of Distributed R in Vertica node

Roadmap: HDFS direct data connectors (CSV and ORC)

Roadmap: Co-location of Distributed R in Hadoop node

Roadmap: Native Vertica connector to Spark (RDDs, DataFrames)

Roadmap: Deploy Spark models
In-database UDx

Roadmap: HP Haven Predictive Analytics with Apache Spark MLlib
Native connectors to Vertica
1. Ingest and prepare data by leveraging Vertica
2. Build and evaluate predictive models on large data sets using Spark MLlib
3. Deploy models to Vertica and use in-database scoring to produce prediction results for BI and applications
Diagram labels: Build models; evaluate models; deploy models (in-database scoring); BI integration