Revolution R Enterprise: Faster Than SAS

White Paper
Benchmarking Results
By Thomas W. Dinsmore and Derek McCrae Norton

In analytics, speed matters. How much? We asked the director of analytics from a leading U.S. marketing services provider, a Revolution Analytics customer. Her team supports more than 1,000 predictive models currently in production; her clients expect the team to build a predictive model in 30 minutes or less.

Previously, we shared results [1] from a performance test comparing Revolution R Enterprise (RRE) to SAS. [2] That test showed how our unique Parallel External Memory Algorithms (PEMA) technology produces vastly better performance for advanced analytics. Some readers noted that SAS and RRE were not tested on the same hardware and that the test was limited to a single task. Others pointed out that SAS offers software to enable deployment in clustered computing environments, similar to what was used for RRE.

We listened, and we set up a new test using the same hardware for both products. To help ensure a fair comparison, we hired an experienced SAS consultant to review the SAS programs, enable them for Grid computing and run the test. We used SAS 9.4 and defined a list of commonly used analytics tasks for the test.

The results:

- ScaleR ran the analysis tasks 42 times faster than SAS.
- ScaleR outperformed SAS on every task.
- The ScaleR advantage ranged from 10 to 300 times the SAS performance.
- The ScaleR advantage increased when we tested on larger data sets.
- The new SAS HP PROCs, where available, only marginally improved SAS performance.

In this white paper, we'll report on the approach we used for the test, together with detailed results.

Approach

For this test, Revolution Analytics engaged a consulting firm experienced with SAS Grid Manager. [3] The consultant set up a clustered computing environment consisting of five four-core machines running CentOS, all networked using Gigabit Ethernet connections and a separate NFS server.
The team deployed SAS Release 9.4, with the following major components:

- Base SAS
- SAS/STAT
- SAS Grid Manager

We used a desktop running SAS Management Console and SAS Enterprise Guide as the Grid Client.

To test Revolution R Enterprise, we first deployed IBM Platform LSF and Platform MPI Release 9 on the grid, then installed Revolution R Enterprise Release 7 on each node. SAS Grid Manager uses an OEM version of IBM Platform LSF that cannot run concurrently with the standard version from IBM used by Revolution R Enterprise, so we ran the tests sequentially and reconfigured the environment for each test.

[2] SAS is a registered trademark of the SAS Institute, Inc.
[3] The consulting firm prefers to remain anonymous.

To simplify test replication across different environments, we used data manufactured through a random process. Time needed to manufacture the data is not included in the benchmark results. Prior to running the actual tests, we loaded the randomized data into each software product's native file format: for SAS, a SAS data set; for Revolution R Enterprise, an XDF file.

Although we have benchmarked Revolution R Enterprise on data sets as large as a billion rows, typical data sets used by even the largest enterprises for the statistical procedures investigated tend to be much smaller. We chose to perform the tests on wide files of 591 columns and row counts ranging from 100,000 to 5 million, file sizes that we consider typical for many analysts. We also ran scoring tests on narrow files of 21 columns with row counts ranging up to 50 million.

For the tests, we defined a sequence of ten frequently used analysis tasks, plus one scoring task. The table below shows the tasks together with the actual SAS 9.4 and Revolution R Enterprise 7 (RRE 7) functions used in the benchmark programs.

Table 1: Benchmark Tasks

Task | RRE 7 Capability | SAS 9.4 Capability
---- | ---------------- | ------------------
Descriptive statistics (n, min, max, mean, std) on 1 numeric variable | rxSummary | PROC SURVEYMEANS
Median and deciles for 1 numeric variable | rxQuantile | PROC SURVEYMEANS
Frequency distribution for 1 text variable | rxCube | PROC FREQ
Linear regression with 1 numeric response variable and 20 numeric predictors | rxLinMod | PROC REG, PROC HPREG
Linear regression with 1 numeric response variable and 20 mixed predictors | rxLinMod | PROC GENMOD
Stepwise linear regression with 100 numeric predictors | rxLinMod | PROC REG
Logistic regression with 1 binary response variable and 20 numeric predictors | rxLogit | PROC LOGISTIC
Generalized linear model with numeric response variable, 20 numeric predictors, gamma distribution and link function | rxGlm | PROC GENMOD
k-means clustering with 20 active variables | rxKmeans | PROC FASTCLUS
k-means clustering with 100 active variables | rxKmeans | PROC FASTCLUS
Using the first linear model, score a file with 10x the number of records in the analysis file | rxPredict | PROC SCORE
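
As a rough illustration of the randomly manufactured input data described above, the following Python sketch writes a narrow synthetic file. It is not the generator used for the benchmark: the column layout (one response `y` plus 20 numeric predictors, giving 21 columns), the normal distributions, the coefficients, and the names `make_rows` and `write_csv` are all assumptions made for this example.

```python
import csv
import io
import random


def make_rows(n_rows, n_predictors=20, seed=0):
    """Yield synthetic rows: one numeric response followed by numeric predictors."""
    rng = random.Random(seed)  # fixed seed so every environment gets identical data
    # Hypothetical coefficients; the benchmark's real generating process is not stated.
    coefs = [rng.uniform(-1.0, 1.0) for _ in range(n_predictors)]
    for _ in range(n_rows):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n_predictors)]
        # Response is a linear combination of the predictors plus Gaussian noise.
        y = sum(c * x for c, x in zip(coefs, xs)) + rng.gauss(0.0, 1.0)
        yield [y] + xs


def write_csv(stream, n_rows, n_predictors=20, seed=0):
    """Write a header plus n_rows synthetic rows to an open text stream."""
    writer = csv.writer(stream)
    writer.writerow(["y"] + [f"x{i + 1}" for i in range(n_predictors)])
    writer.writerows(make_rows(n_rows, n_predictors, seed))


# Small in-memory demonstration; a real run would write to a file on disk.
buffer = io.StringIO()
write_csv(buffer, n_rows=1000)
lines = buffer.getvalue().splitlines()
```

Because the rows are produced by a generator, the file can be written at any row count without holding the whole data set in memory, and the fixed seed keeps the data identical across test environments.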

We performed all benchmark tests sequentially, with no other operations running concurrently. The actual SAS 9.4 and RRE 7 programs used for this test are freely available on GitHub: https://github.com/revolutionanalytics/benchmark. We invite readers to test these scripts in any environment and compare their results to those we have published below.

Results

The table below shows results for the larger data set of five million records. Using SAS 9.4, the complete script takes on average [4] 5,192 seconds (about 1.5 hours) to complete in the benchmark environment. The same tasks performed in Revolution R Enterprise 7 take 124 seconds (about two minutes) to complete. Table 2 shows the performance of SAS 9.4 and RRE 7 for each of the 10 components of the script.

Table 2: Benchmark Results, n = 5,000,000. Columns: Task; Runtime in seconds (RRE 7, SAS 9.4); RRE 7 Speed Multiple. Rows: Descriptive statistics; Median and deciles; Frequency distribution; Linear regression with 20 numeric predictors; Linear regression with 20 mixed predictors; Stepwise linear regression, 100 numeric predictors; Logistic regression with 20 numeric predictors; Generalized linear model, 20 numeric predictors; k-means clustering, 20 active variables; k-means clustering, 100 active variables; Total, all tasks. [Per-task values not recoverable.]

[4] Results averaged over multiple runs; no significant variation across runs.
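
The headline speed multiple follows directly from the two total runtimes quoted in the text. A short Python check is sketched here; the function name `speed_multiple` and the constant names are ours, and only the two totals come from the paper.

```python
def speed_multiple(sas_seconds, rre_seconds):
    """Return how many times faster the RRE run is than the SAS run."""
    return sas_seconds / rre_seconds


SAS_TOTAL_SECONDS = 5192  # total SAS 9.4 runtime reported for n = 5,000,000
RRE_TOTAL_SECONDS = 124   # total RRE 7 runtime reported for n = 5,000,000

multiple = speed_multiple(SAS_TOTAL_SECONDS, RRE_TOTAL_SECONDS)
sas_hours = SAS_TOTAL_SECONDS / 3600
rre_minutes = RRE_TOTAL_SECONDS / 60

print(f"speed multiple: {multiple:.1f}x")    # 41.9x, which rounds to 42x
print(f"SAS total: {sas_hours:.2f} hours")    # 1.44 hours
print(f"RRE total: {rre_minutes:.2f} minutes")  # 2.07 minutes
```

The ratio is about 41.9, which the paper reports as 42 times.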

RRE's performance advantage increases as the number of records analyzed increases, as shown below:

Table 3: Results by Size of Analysis Data Set (Total, All Tasks). Columns: Analysis File Size; Runtime in seconds (RRE 7, SAS 9.4); RRE 7 Speed Multiple. [Table values not recoverable.]

The scoring test uses the predictive model produced in the first linear regression run and an independent table with 10 times as many rows as the analysis data set. The table below shows the results from this test:

Table 4: Results for Scoring (Scoring Task). Columns: Scoring File Size; Runtime in seconds (RRE 7, SAS 9.4); RRE 7 Speed Multiple. [Table values not recoverable.]

With SAS 9.4, SAS bundles HP PROCs designed for use with SAS High Performance Analytics Server. We substituted PROC HPREG for PROC REG in one of the tests. In the benchmark environment, use of this High Performance Analytics PROC does not materially improve SAS performance. [5]

Table 5: Results for PROC HPREG (Linear Regression). Columns: Analysis File Size; Runtime in seconds (PROC REG, PROC HPREG, RRE 7). [Table values not recoverable.]

[5] According to SAS documentation, HP PROCs run in Single Machine mode unless the customer licenses SAS High Performance Analytics Server.

Discussion

Speed matters. In the time it takes to run our benchmark script in SAS 9.4 under our test conditions, a user can perform the same tasks 42 times in RRE 7. Based on the outcomes of this test, in practice, an analyst using RRE 7 should be able to build more models, build better models by reducing the learning cycle, serve more customers and produce more revenue.

Why does RRE 7 run faster than SAS 9.4? Revolution Analytics uses unique technology called Parallel External Memory Algorithms (PEMA) to distribute operations over multiple machines in a clustered architecture. When a data set is larger than memory on a single machine, Revolution R Enterprise 7 streams the data across all of the available computing resources.

By contrast, SAS/STAT software swaps data between memory and disk when a data set is larger than memory, a process that is much slower than in-memory operations. When these SAS programs run in a grid configuration, even when fully enabled for Grid operations, most SAS PROCs do not take advantage of the available computing resources. According to SAS, only four [6] of the PROCs in SAS/STAT are able to take advantage of multiple computing threads. As we demonstrated by testing the HPREG PROC, the SAS HP PROCs do not improve performance unless the customer also licenses SAS High Performance Analytics Server.

Revolution Analytics takes performance and efficiency seriously, and we continuously improve the efficiency and speed of our analytics engine. We invite everyone, customers and competitors alike, to run our benchmarks and share your results with us.
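
The paper does not describe how PEMA works internally. As a general illustration of the external-memory idea only, the Python sketch below computes the descriptive statistics from the first benchmark task (n, min, max, mean, std) one chunk at a time and merges the partial results, so no step needs the full data set in memory and chunks could in principle be summarized on different machines. This is not Revolution Analytics' implementation; the names `Partial`, `summarize_chunk`, `merge` and `chunks` are hypothetical, and the merge step uses the standard pairwise formula for combining means and sums of squared deviations.

```python
import math
from dataclasses import dataclass


@dataclass
class Partial:
    """Summary of the rows seen so far; small enough to pass between workers."""
    n: int = 0
    minimum: float = math.inf
    maximum: float = -math.inf
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the mean

    def std(self):
        """Sample standard deviation (n - 1 denominator)."""
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else float("nan")


def summarize_chunk(values):
    """Summarize one chunk that fits in memory."""
    n = len(values)
    if n == 0:
        return Partial()
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values)
    return Partial(n, min(values), max(values), mean, m2)


def merge(a, b):
    """Combine two partial summaries without revisiting the underlying rows."""
    if a.n == 0:
        return b
    if b.n == 0:
        return a
    n = a.n + b.n
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta * delta * a.n * b.n / n
    return Partial(n, min(a.minimum, b.minimum), max(a.maximum, b.maximum), mean, m2)


def chunks(values, size):
    """Yield successive fixed-size slices, standing in for blocks read from disk."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


# Demonstration on a small in-memory list; real chunks would be streamed from a file.
data = [float(i % 17) for i in range(10_000)]
total = Partial()
for chunk in chunks(data, size=1_000):
    total = merge(total, summarize_chunk(chunk))
```

Because `merge` depends only on the two summaries, the order in which chunks are combined does not change the result beyond floating-point rounding, which is what lets work of this kind be spread over several cores or nodes.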

About Revolution Analytics, Inc.

As the leading commercial provider of software and services based on the open source R project for statistical computing, and headquartered in Mountain View, California, Revolution Analytics brings Big Data scalability, performance and cross-platform enterprise readiness to R, the world's most widely used statistics software. The company's flagship Revolution R Enterprise software is designed to meet the production needs of leading organizations in data-driven industries, including finance, retail, manufacturing and digital media. Used by over 2 million data scientists in academia and at cutting-edge organizations such as Google, Lloyd's of London and the U.S. Food and Drug Administration, R is the standard of innovation in statistical analysis. Revolution Analytics is committed to supporting the continued growth of the R community by sponsoring R user groups and conferences worldwide, and the company offers free licenses for Revolution R Enterprise to everyone in academia.

Revolution Analytics, Inc. All rights reserved. Printed in the United States of America. This white paper is for informational purposes only. REVOLUTION ANALYTICS MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY. Revolution R Enterprise, RRE, Revolution Analytics, the Revolution Analytics and Revolution R Enterprise logos, and Big Data, Big Analytics Platform are trademarks or registered trademarks of Revolution Analytics, Inc., in the United States and/or other countries. Other product names mentioned herein may be trademarks of their respective companies.

Technical Paper. Performance of SAS In-Memory Statistics for Hadoop. A Benchmark Study. Allison Jennifer Ames Xiangxiang Meng Wayne Thompson

Technical Paper. Performance of SAS In-Memory Statistics for Hadoop. A Benchmark Study. Allison Jennifer Ames Xiangxiang Meng Wayne Thompson Technical Paper Performance of SAS In-Memory Statistics for Hadoop A Benchmark Study Allison Jennifer Ames Xiangxiang Meng Wayne Thompson Release Information Content Version: 1.0 May 20, 2014 Trademarks

More information

Scalable Data Analysis in R. Lee E. Edlefsen Chief Scientist UserR! 2011

Scalable Data Analysis in R. Lee E. Edlefsen Chief Scientist UserR! 2011 Scalable Data Analysis in R Lee E. Edlefsen Chief Scientist UserR! 2011 1 Introduction Our ability to collect and store data has rapidly been outpacing our ability to analyze it We need scalable data analysis

More information

Driving Value from Big Data

Driving Value from Big Data Executive White Paper Driving Value from Big Data Bill Jacobs, Director of Product Marketing & Thomas W. Dinsmore, Director of Product Management Abstract Businesses are rapidly investing in Hadoop to

More information

Delivering Value from Big Data with Revolution R Enterprise and Hadoop

Delivering Value from Big Data with Revolution R Enterprise and Hadoop Executive White Paper Delivering Value from Big Data with Revolution R Enterprise and Hadoop Bill Jacobs, Director of Product Marketing Thomas W. Dinsmore, Director of Product Management October 2013 Abstract

More information

High Performance Predictive Analytics in R and Hadoop:

High Performance Predictive Analytics in R and Hadoop: High Performance Predictive Analytics in R and Hadoop: Achieving Big Data Big Analytics Presented by: Mario E. Inchiosa, Ph.D. US Chief Scientist August 27, 2013 1 Polling Questions 1 & 2 2 Agenda Revolution

More information

RevoScaleR Speed and Scalability

RevoScaleR Speed and Scalability EXECUTIVE WHITE PAPER RevoScaleR Speed and Scalability By Lee Edlefsen Ph.D., Chief Scientist, Revolution Analytics Abstract RevoScaleR, the Big Data predictive analytics library included with Revolution

More information

Scalability of the SAS/STAT HPGENSELECT High-Performance Analytical Procedure: A Comparison with RevoScaleR

Scalability of the SAS/STAT HPGENSELECT High-Performance Analytical Procedure: A Comparison with RevoScaleR Technical Paper Scalability of the SAS/STAT HPGENSELECT High-Performance Analytical Procedure: A Comparison with RevoScaleR Effectively implementing high-performance analytics software solutions in the

More information

Lavastorm Analytic Library Predictive and Statistical Analytics Node Pack FAQs

Lavastorm Analytic Library Predictive and Statistical Analytics Node Pack FAQs 1.1 Introduction Lavastorm Analytic Library Predictive and Statistical Analytics Node Pack FAQs For brevity, the Lavastorm Analytics Library (LAL) Predictive and Statistical Analytics Node Pack will be

More information

Understanding the Benefits of IBM SPSS Statistics Server

Understanding the Benefits of IBM SPSS Statistics Server IBM SPSS Statistics Server Understanding the Benefits of IBM SPSS Statistics Server Contents: 1 Introduction 2 Performance 101: Understanding the drivers of better performance 3 Why performance is faster

More information

Integrated Grid Solutions. and Greenplum

Integrated Grid Solutions. and Greenplum EMC Perspective Integrated Grid Solutions from SAS, EMC Isilon and Greenplum Introduction Intensifying competitive pressure and vast growth in the capabilities of analytic computing platforms are driving

More information

IBM Spectrum Scale vs EMC Isilon for IBM Spectrum Protect Workloads

IBM Spectrum Scale vs EMC Isilon for IBM Spectrum Protect Workloads 89 Fifth Avenue, 7th Floor New York, NY 10003 www.theedison.com @EdisonGroupInc 212.367.7400 IBM Spectrum Scale vs EMC Isilon for IBM Spectrum Protect Workloads A Competitive Test and Evaluation Report

More information

Table of Contents. June 2010

Table of Contents. June 2010 June 2010 From: StatSoft Analytics White Papers To: Internal release Re: Performance comparison of STATISTICA Version 9 on multi-core 64-bit machines with current 64-bit releases of SAS (Version 9.2) and

More information

SAS deployment on IBM Power servers with IBM PowerVM dedicated-donating LPARs

SAS deployment on IBM Power servers with IBM PowerVM dedicated-donating LPARs SAS deployment on IBM Power servers with IBM PowerVM dedicated-donating LPARs Narayana Pattipati IBM Systems and Technology Group ISV Enablement January 2013 Table of contents Abstract... 1 IBM PowerVM

More information

Unprecedented Performance and Scalability Demonstrated For Meter Data Management:

Unprecedented Performance and Scalability Demonstrated For Meter Data Management: Unprecedented Performance and Scalability Demonstrated For Meter Data Management: Ten Million Meters Scalable to One Hundred Million Meters For Five Billion Daily Meter Readings Performance testing results

More information

Delivering value from big data with Microsoft R Server and Hadoop

Delivering value from big data with Microsoft R Server and Hadoop EXECUTIVE WHITE PAPER Delivering value from big data with Microsoft R Server and Hadoop Microsoft Advanced Analytics Team April 2016 ABSTRACT Businesses are continuing to invest in Hadoop to manage analytic

More information

Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software

Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software WHITEPAPER Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software SanDisk ZetaScale software unlocks the full benefits of flash for In-Memory Compute and NoSQL applications

More information

Advanced Big Data Analytics with R and Hadoop

Advanced Big Data Analytics with R and Hadoop REVOLUTION ANALYTICS WHITE PAPER Advanced Big Data Analytics with R and Hadoop 'Big Data' Analytics as a Competitive Advantage Big Analytics delivers competitive advantage in two ways compared to the traditional

More information

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics An Oracle White Paper November 2010 Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics 1 Introduction New applications such as web searches, recommendation engines,

More information

Cloud Storage. Parallels. Performance Benchmark Results. White Paper. www.parallels.com

Cloud Storage. Parallels. Performance Benchmark Results. White Paper. www.parallels.com Parallels Cloud Storage White Paper Performance Benchmark Results www.parallels.com Table of Contents Executive Summary... 3 Architecture Overview... 3 Key Features... 4 No Special Hardware Requirements...

More information

HP reference configuration for entry-level SAS Grid Manager solutions

HP reference configuration for entry-level SAS Grid Manager solutions HP reference configuration for entry-level SAS Grid Manager solutions Up to 864 simultaneous SAS jobs and more than 3 GB/s I/O throughput Technical white paper Table of contents Executive summary... 2

More information

DATABASES AND ERP SELECTION: ORACLE VS SQL SERVER

DATABASES AND ERP SELECTION: ORACLE VS SQL SERVER WHITE PAPER DATABASES AND ERP SELECTION: ORACLE VS SQL SERVER Databases and ERP Selection: Oracle vs SQL Server By Rick Veague, Chief Technology Officer, IFS North America An enterprise application like

More information

A financial software company

A financial software company A financial software company Projecting USD10 million revenue lift with the IBM Netezza data warehouse appliance Overview The need A financial software company sought to analyze customer engagements to

More information

SAP HANA. SAP HANA Performance Efficient Speed and Scale-Out for Real-Time Business Intelligence

SAP HANA. SAP HANA Performance Efficient Speed and Scale-Out for Real-Time Business Intelligence SAP HANA SAP HANA Performance Efficient Speed and Scale-Out for Real-Time Business Intelligence SAP HANA Performance Table of Contents 3 Introduction 4 The Test Environment Database Schema Test Data System

More information

Name: Srinivasan Govindaraj Title: Big Data Predictive Analytics

Name: Srinivasan Govindaraj Title: Big Data Predictive Analytics Name: Srinivasan Govindaraj Title: Big Data Predictive Analytics Please note the following IBM s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice

More information

Using DeployR to Solve the R Integration Problem

Using DeployR to Solve the R Integration Problem DEPLOYR WHITE PAPER Using DeployR to olve the R Integration Problem By the Revolution Analytics DeployR Team March 2015 Introduction Organizations use analytics to empower decision making, often in real

More information

IBM SPSS Modeler Professional

IBM SPSS Modeler Professional IBM SPSS Modeler Professional Make better decisions through predictive intelligence Highlights Create more effective strategies by evaluating trends and likely outcomes. Easily access, prepare and model

More information

QLIKVIEW SERVER MEMORY MANAGEMENT AND CPU UTILIZATION

QLIKVIEW SERVER MEMORY MANAGEMENT AND CPU UTILIZATION QLIKVIEW SERVER MEMORY MANAGEMENT AND CPU UTILIZATION QlikView Scalability Center Technical Brief Series September 2012 qlikview.com Introduction This technical brief provides a discussion at a fundamental

More information

Cluster Computing at HRI

Cluster Computing at HRI Cluster Computing at HRI J.S.Bagla Harish-Chandra Research Institute, Chhatnag Road, Jhunsi, Allahabad 211019. E-mail: jasjeet@mri.ernet.in 1 Introduction and some local history High performance computing

More information

InfiniteGraph: The Distributed Graph Database

InfiniteGraph: The Distributed Graph Database A Performance and Distributed Performance Benchmark of InfiniteGraph and a Leading Open Source Graph Database Using Synthetic Data Objectivity, Inc. 640 West California Ave. Suite 240 Sunnyvale, CA 94086

More information

High Performance Time-Series Analysis Powered by Cutting-Edge Database Technology

High Performance Time-Series Analysis Powered by Cutting-Edge Database Technology High Performance Time-Series Analysis Powered by Cutting-Edge Database Technology Overview Country or Region: United Kingdom Industry: Financial Services Customer Profile builds data and analytics management

More information

Big data management with IBM General Parallel File System

Big data management with IBM General Parallel File System Big data management with IBM General Parallel File System Optimize storage management and boost your return on investment Highlights Handles the explosive growth of structured and unstructured data Offers

More information

IBM FlashSystem and Atlantis ILIO

IBM FlashSystem and Atlantis ILIO IBM FlashSystem and Atlantis ILIO Cost-effective, high performance, and scalable VDI Highlights Lower-than-PC cost Better-than-PC user experience Lower project risks Fast provisioning and better management

More information

Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities

Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities Technology Insight Paper Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities By John Webster February 2015 Enabling you to make the best technology decisions Enabling

More information

R and Hadoop: Architectural Options. Bill Jacobs VP Product Marketing & Field CTO, Revolution Analytics @bill_jacobs

R and Hadoop: Architectural Options. Bill Jacobs VP Product Marketing & Field CTO, Revolution Analytics @bill_jacobs R and Hadoop: Architectural Options Bill Jacobs VP Product Marketing & Field CTO, Revolution Analytics @bill_jacobs Polling Question #1: Who Are You? (choose one) Statistician or modeler who uses R Other

More information

Next-Generation Predictive Analytics. Research Report Executive Summary. Using Forward-Looking Insights to Gain Competitive Advantage.

Next-Generation Predictive Analytics. Research Report Executive Summary. Using Forward-Looking Insights to Gain Competitive Advantage. Next-Generation Predictive Analytics Using Forward-Looking Insights to Gain Competitive Advantage Research Report Executive Summary Sponsored by Copyright Ventana Research 2013 Do Not Redistribute Without

More information

DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION

DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION A DIABLO WHITE PAPER AUGUST 2014 Ricky Trigalo Director of Business Development Virtualization, Diablo Technologies

More information

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat ESS event: Big Data in Official Statistics Antonino Virgillito, Istat v erbi v is 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web

More information

Integrating Apache Spark with an Enterprise Data Warehouse

Integrating Apache Spark with an Enterprise Data Warehouse Integrating Apache Spark with an Enterprise Warehouse Dr. Michael Wurst, IBM Corporation Architect Spark/R/Python base Integration, In-base Analytics Dr. Toni Bollinger, IBM Corporation Senior Software

More information

Working Together to Promote Business Innovations with Grid Computing

Working Together to Promote Business Innovations with Grid Computing IBM and SAS Working Together to Promote Business Innovations with Grid Computing A SAS White Paper Table of Contents Executive Summary... 1 Grid Computing Overview... 1 Benefits of Grid Computing... 1

More information

Revolution R Enterprise: Efficient Predictive Analytics for Big Data

Revolution R Enterprise: Efficient Predictive Analytics for Big Data Revolution R Enterprise: Efficient Predictive Analytics for Big Data Prepared for The Bloor Group August 2014 Bill Jacobs Director Product Marketing / Field CTO - Big Data Products bill.jacobs@revolutionanalytics.com

More information

Maximierung des Geschäftserfolgs durch SAP Predictive Analytics. Andreas Forster, May 2014

Maximierung des Geschäftserfolgs durch SAP Predictive Analytics. Andreas Forster, May 2014 Maximierung des Geschäftserfolgs durch SAP Predictive Analytics Andreas Forster, May 2014 Legal Disclaimer The information in this presentation is confidential and proprietary to SAP and may not be disclosed

More information

JVM Performance Study Comparing Oracle HotSpot and Azul Zing Using Apache Cassandra

JVM Performance Study Comparing Oracle HotSpot and Azul Zing Using Apache Cassandra JVM Performance Study Comparing Oracle HotSpot and Azul Zing Using Apache Cassandra January 2014 Legal Notices Apache Cassandra, Spark and Solr and their respective logos are trademarks or registered trademarks

More information

Big Data and Data Science: Behind the Buzz Words

Big Data and Data Science: Behind the Buzz Words Big Data and Data Science: Behind the Buzz Words Peggy Brinkmann, FCAS, MAAA Actuary Milliman, Inc. April 1, 2014 Contents Big data: from hype to value Deconstructing data science Managing big data Analyzing

More information

How to Ingest Data into Google BigQuery using Talend for Big Data. A Technical Solution Paper from Saama Technologies, Inc.

How to Ingest Data into Google BigQuery using Talend for Big Data. A Technical Solution Paper from Saama Technologies, Inc. How to Ingest Data into Google BigQuery using Talend for Big Data A Technical Solution Paper from Saama Technologies, Inc. July 30, 2013 Table of Contents Intended Audience What you will Learn Background

More information

ANALYTICS CENTER LEARNING PROGRAM

ANALYTICS CENTER LEARNING PROGRAM Overview of Curriculum ANALYTICS CENTER LEARNING PROGRAM The following courses are offered by Analytics Center as part of its learning program: Course Duration Prerequisites 1- Math and Theory 101 - Fundamentals

More information

Unstructured Data Accelerator (UDA) Author: Motti Beck, Mellanox Technologies Date: March 27, 2012

Unstructured Data Accelerator (UDA) Author: Motti Beck, Mellanox Technologies Date: March 27, 2012 Unstructured Data Accelerator (UDA) Author: Motti Beck, Mellanox Technologies Date: March 27, 2012 1 Market Trends Big Data Growing technology deployments are creating an exponential increase in the volume

More information

Dell* In-Memory Appliance for Cloudera* Enterprise

Dell* In-Memory Appliance for Cloudera* Enterprise Built with Intel Dell* In-Memory Appliance for Cloudera* Enterprise Find out what faster big data analytics can do for your business The need for speed in all things related to big data is an enormous

More information

Green HPC - Dynamic Power Management in HPC

Green HPC - Dynamic Power Management in HPC Gr eenhpc Dynami cpower Management i nhpc AT ECHNOL OGYWHI T EP APER Green HPC Dynamic Power Management in HPC 2 Green HPC - Dynamic Power Management in HPC Introduction... 3 Green Strategies... 4 Implementation...

More information

High-Performance Analytics

High-Performance Analytics High-Performance Analytics David Pope January 2012 Principal Solutions Architect High Performance Analytics Practice Saturday, April 21, 2012 Agenda Who Is SAS / SAS Technology Evolution Current Trends

More information

Hur hanterar vi utmaningar inom området - Big Data. Jan Östling Enterprise Technologies Intel Corporation, NER

Hur hanterar vi utmaningar inom området - Big Data. Jan Östling Enterprise Technologies Intel Corporation, NER Hur hanterar vi utmaningar inom området - Big Data Jan Östling Enterprise Technologies Intel Corporation, NER Legal Disclaimers All products, computer systems, dates, and figures specified are preliminary

More information

Qlik Sense scalability

Qlik Sense scalability Qlik Sense scalability Visual analytics platform Qlik Sense is a visual analytics platform powered by an associative, in-memory data indexing engine. Based on users selections, calculations are computed

More information

CASE STUDY: Oracle TimesTen In-Memory Database and Shared Disk HA Implementation at Instance level. -ORACLE TIMESTEN 11gR1

CASE STUDY: Oracle TimesTen In-Memory Database and Shared Disk HA Implementation at Instance level. -ORACLE TIMESTEN 11gR1 CASE STUDY: Oracle TimesTen In-Memory Database and Shared Disk HA Implementation at Instance level -ORACLE TIMESTEN 11gR1 CASE STUDY Oracle TimesTen In-Memory Database and Shared Disk HA Implementation

More information

Performance characterization report for Microsoft Hyper-V R2 on HP StorageWorks P4500 SAN storage

Performance characterization report for Microsoft Hyper-V R2 on HP StorageWorks P4500 SAN storage Performance characterization report for Microsoft Hyper-V R2 on HP StorageWorks P4500 SAN storage Technical white paper Table of contents Executive summary... 2 Introduction... 2 Test methodology... 3

More information

R Tools Evaluation. A review by Analytics @ Global BI / Local & Regional Capabilities. Telefónica CCDO May 2015

R Tools Evaluation. A review by Analytics @ Global BI / Local & Regional Capabilities. Telefónica CCDO May 2015 R Tools Evaluation A review by Analytics @ Global BI / Local & Regional Capabilities Telefónica CCDO May 2015 R Features What is? Most widely used data analysis software Used by 2M+ data scientists, statisticians

More information

Fast Analytics on Big Data with H20

Fast Analytics on Big Data with H20 Fast Analytics on Big Data with H20 0xdata.com, h2o.ai Tomas Nykodym, Petr Maj Team About H2O and 0xdata H2O is a platform for distributed in memory predictive analytics and machine learning Pure Java,

More information

In-Database Analytics Deep Dive with Teradata and Revolution R

In-Database Analytics Deep Dive with Teradata and Revolution R In-Database Analytics Deep Dive with Teradata and Revolution R Mario Inchiosa Chief Scientist, Revolution Analytics Tim Miller Partner Integration Lab, Teradata Agenda Introduction Revolution R Enterprise

More information

FLOW-3D Performance Benchmark and Profiling. September 2012

FLOW-3D Performance Benchmark and Profiling. September 2012 FLOW-3D Performance Benchmark and Profiling September 2012 Note The following research was performed under the HPC Advisory Council activities Participating vendors: FLOW-3D, Dell, Intel, Mellanox Compute

More information

An Oracle White Paper June 2012. High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database

Executive Overview, Introduction, Oracle Loader for Hadoop, Oracle Direct

WHAT'S NEW IN SAS 9.4

WHAT'S NEW IN SAS 9.4 PLATFORM, HPA & SAS GRID COMPUTING MICHAEL GODDARD CHIEF ARCHITECT SAS INSTITUTE, NEW ZEALAND SAS 9.4 WHAT'S NEW IN THE PLATFORM Platform update SAS Grid Computing update Hadoop support

LOAD BALANCING 2X APPLICATIONSERVER XG SECURE CLIENT GATEWAYS THROUGH MICROSOFT NETWORK LOAD BALANCING

Contents: Introduction, Network Diagram, Installing NLB, Configuring NLB, Configuring 2X Secure Client Gateway, About

SQL Server 2012 Performance White Paper

Published: April 2012 Applies to: SQL Server 2012 Copyright The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication.

Kronos Workforce Central 6.1 with Microsoft SQL Server: Performance and Scalability for the Enterprise

Providing Enterprise-Class Performance and Scalability and Driving Lower Customer Total Cost of Ownership

SAS and Oracle: Big Data and Cloud Partnering Innovation Targets the Third Platform

David Lawler, Oracle Senior Vice President, Product Management and Strategy Paul Kent, SAS Vice President, Big Data What

Data Center Solutions

Systems, software and hardware solutions you can trust. With over 25 years of storage innovation, SanDisk is a global flash technology leader. At SanDisk, we're expanding the possibilities

Performance and Scalability Overview

This guide provides an overview of some of the performance and scalability capabilities of the Pentaho Business Analytics Platform. Contents Pentaho Scalability and

White Paper. How Streaming Data Analytics Enables Real-Time Decisions

Contents: Introduction, What Is Streaming Analytics?, How Does SAS Event Stream Processing Work?, Overview, Event Stream

JBoss Data Grid Performance Study Comparing Java HotSpot to Azul Zing

January 2014 Legal Notices JBoss, Red Hat and their respective logos are trademarks or registered trademarks of Red Hat, Inc. Azul

Decision Trees built in Hadoop plus more Big Data Analytics with Revolution R Enterprise

Revolution Webinar April 17, 2014 Mario Inchiosa, US Chief Scientist mario.inchiosa@revolutionanalytics.com All

Intellicus Enterprise Reporting and BI Platform

Intellicus Cluster and Load Balancer Installation and Configuration Manual Intellicus Technologies info@intellicus.com www.intellicus.com Copyright 2012

Architectures for Big Data Analytics A database perspective

Fernando Velez Director of Product Management Enterprise Information Management, SAP June 2013 Outline Big Data Analytics Requirements Spectrum

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics

ESSENTIALS EMC ISILON Use the industry's first and only scale-out NAS solution with native Hadoop

IBM PureFlex and Atlantis ILIO: Cost-effective, high-performance and scalable persistent VDI

Highlights Lower than PC cost: saves hundreds of dollars per desktop, as storage capacity and performance requirements

Get More Scalability and Flexibility for Big Data

Solution Overview LexisNexis High-Performance Computing Cluster Systems Platform What You Will Learn Modern enterprises are challenged with the need to store and

PARALLELS CLOUD STORAGE

Performance Benchmark Results. Table of Contents: Executive Summary, Architecture Overview, Key Features, No Special Hardware Requirements

Colgate-Palmolive selects SAP HANA to improve the speed of business analytics with IBM and SAP

Founded in 1806, Colgate-Palmolive is a global consumer products company which sells nearly $17 billion annually in personal care, home care,

IOmark-VDI. HP HP ConvergedSystem 242-HC StoreVirtual Test Report: VDI-HC-150427-b Test Report Date: 27, April 2015. www.iomark.

Copyright 2010-2014 Evaluator Group, Inc. All rights reserved. IOmark-VDI, IOmark-VM, VDI-IOmark, and IOmark

Advanced In-Database Analytics

Tallinn, Sept. 25th, 2012 Mikko-Pekka Bertling, BDM Greenplum EMEA. That sounds complicated? Who can tell me how best to solve this? What are the main mathematical functions?

An In-Depth Look at In-Memory Predictive Analytics for Developers

September 9-11, 2013, Anaheim, California. Philip Mugglestone, SAP. Learning Points: Understand the SAP HANA Predictive Analysis library (PAL)

Make Better Decisions Through Predictive Intelligence

IBM SPSS Modeler Professional Highlights Easily access, prepare and model structured data with this intuitive, visual data mining workbench Rapidly

CUSTOMER Presentation of SAP Predictive Analytics

SAP Predictive Analytics 2.0 2015-02-09 Content: 1 SAP Predictive Analytics Overview, 2 Deployment Configurations, 3 SAP Predictive Analytics Desktop

Scalable Machine Learning - or what to do with all that Big Data infrastructure

TU Berlin blog.mikiobraun.de Strata+Hadoop World London, 2015 Complex Data Analysis at Scale Click-through prediction Personalized Spam Detection

Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc. 2011 2014. All Rights Reserved

A change in focus. A shift in Advertising From mass branding A shift in Financial Services From Educated Investing A shift in Healthcare From mass treatment

Scaling Objectivity Database Performance with Panasas Scale-Out NAS Storage

White Paper A Benchmark Report August 211 Background Objectivity/DB uses a powerful distributed processing architecture to manage

Manjrasoft Market Oriented Cloud Computing Platform

Aneka is a market oriented Cloud development and management platform with rapid application development and workload distribution capabilities.

White Paper. Redefine Your Analytics Journey With Self-Service Data Discovery and Interactive Predictive Analytics

Contents: Self-service data discovery and interactive predictive analytics, What does

Software-defined Storage Architecture for Analytics Computing

Arati Joshi Performance Engineering Colin Eldridge File System Engineering Carlos Carrero Product Management June 2015 Reference Architecture

Best Practices for Data Sharing in a Grid Distributed SAS Environment. Updated July 2010

Best Practice Document. Table of Contents: 1 Abstract, 1.1 Storage performance is critical

Qlik Sense Enabling the New Enterprise

Technical Brief Generations of Business Intelligence The evolution of the BI market can be described as a series of disruptions. Each change occurred when a technology

R is Ready for Business

Revolution 6 High- Analytics for the Revolution is production grade analytics software built upon the powerful open source statistics language. With commercial enhancements and professional

SAS Business Analytics. Base SAS for SAS 9.2

Performance & Scalability of SAS Business Analytics on an NEC Express5800/A1080a (Intel Xeon 7500 series-based Platform) using Red Hat Enterprise Linux 5 Red

Key Messages of Enterprise Cluster NAS Huawei OceanStor N8500

1. High performance and high reliability, addressing big data challenges High performance: In the SPEC benchmark test,

Laurence Liew General Manager, APAC. Economics Is Driving Big Data Analytics to the Cloud

Big Data 101 The Analytics Stack Economics of Big Data Convergence of the 3 forces Big Data Analytics in the Cloud

Achieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks

WHITE PAPER July 2014 Contents: Executive Summary, Background, InfiniteGraph, High Performance

Driving IBM BigInsights Performance Over GPFS Using InfiniBand+RDMA

WHITE PAPER April 2014 Executive Summary, Background, File Systems Architecture, Network Architecture, IBM BigInsights

Access Control In Virtual Environments

A FoxT White Paper Rapid growth in the use of virtualization tools means system administrators are now able to isolate processes in exclusive run-time environments. While helping

Up Your R Game. James Taylor, Decision Management Solutions Bill Franks, Teradata

Today's Speakers: James Taylor, CEO, Decision Management Solutions; Bill Franks, Chief Analytics Officer, Teradata. 7/28/14 Polling

Actian SQL in Hadoop Buyer's Guide

Contents: Introduction: Big Data and Hadoop; SQL on Hadoop Benefits; Approaches to SQL on Hadoop; The Top 10 SQL in Hadoop Capabilities; SQL in Hadoop

Innovative technology for big data analytics

Technical white paper The HP Vertica Analytics Platform database provides price/performance, scalability, availability, and ease of administration Table of

Impact of Big Data growth On Transparent Computing

Michael A. Greene Intel Vice President, Software and Services Group, General Manager, System Technologies and Optimization Transparent Computing (TC)