Table of Contents. June 2010


June 2010

From: StatSoft Analytics White Papers
To: Internal release
Re: Performance comparison of STATISTICA Version 9 on multi-core 64-bit machines with current 64-bit releases of SAS (Version 9.2) and PASW (formerly SPSS) Statistics Version 18; basic data management, basic statistics, and aggregation operations

Table of Contents

Abstract
Overview
    Ability to Take Advantage of the Most Recent High-End Hardware
    Different Performance Priorities across Platforms
    Multithreading in SAS
    Multithreading in PASW (SPSS)
    Choosing Algorithms and Use Cases to Compare
    Data File Sizes, Buffering Mechanisms (Best Achieved Performance), and Repeatability
Methods
    Hardware
    Software
    Data
    Data Management Operations, Data Analysis Operations
Results
Conclusion

Performance Comparison of STATISTICA 9 (64-bit) vs. SAS, PASW (SPSS); 2010

Abstract

The purpose of this document is to provide a snapshot of key comparisons regarding the speed and efficiency of STATISTICA Version 9 data management and basic statistical computations, with reference to the performance of key competitors in the marketplace, namely SAS Version 9.2 and PASW (formerly SPSS) Statistics Version 18. This "snapshot" was taken in May 2010, after the most recent releases of 64-bit versions of SAS and PASW became available for Microsoft Windows-based 64-bit multi-core computers (and servers).

This comparison does not compare the speed of advanced analytic computations or data mining algorithms, because those comparisons are in practice almost impossible to prepare in a methodologically clean and meaningful manner: they will inevitably be confounded by the selection of specific options and implementation details (which are incompatible between the three programs), the default statistics and results that are reported, and so on. However, generally speaking, the results of the comparisons regarding advanced analytics and data mining procedures mirror those reported here for basic statistics (descriptives, correlations, and data management operations).

In order to measure the effectiveness of the basic implementation of multithreading in the STATISTICA Version 9 architecture, the most common and basic data management operations and basic statistics were chosen for the comparisons, where the performance capabilities (and limitations) of basic data reading and processing operations can be compared unambiguously.

In summary, in the majority of test cases, STATISTICA Version 9 significantly exceeds the performance of the competing computing platforms.

Overview

The comparison of the performance of statistical analysis platforms is an inherently difficult and somewhat ambiguous task. Specifically, the following issues and challenges have to be considered.
Ability to Take Advantage of the Most Recent High-End Hardware

As of this writing (June 2010), multi-core 64-bit laptop, desktop, and Windows server computers are commonly available for a fraction of the price of the analytics software that can take advantage of this hardware. In fact, on the Windows operating system, STATISTICA, PASW (formerly SPSS), and SAS only recently became available for 64-bit computing platforms, and multithreading capabilities (which let multiple processor cores work simultaneously on the same computational task) have been introduced to different specific operations only in the most current releases of these software packages. Therefore, in order to construct fair comparisons, the most recent versions of the competing packages for 64-bit platforms were installed on identical hardware for this comparison.

Different Performance Priorities across Platforms

The goal of the next generation of computational optimizations and dynamic multithreading support in STATISTICA Version 9 was to further improve the speed of the specific operations where customers had told us that performance was critical. Therefore, emphasis was placed on:

1. Data management operations (data reading, sub-setting, etc.)
2. Data aggregation (basic statistical summaries) and other basic statistics
3. Key data mining and predictive modeling algorithms commonly applied to large data sets (Classification and Regression Trees, Stochastic Gradient Boosting (Boosted Trees), Random Forests (Tree Nets), Generalized Linear Models, etc.)

Multithreading in SAS

According to the most recent documentation, multithreading is not implemented in most of the predictive modeling algorithms in SAS (with the exception of linear models). Sorting, basic data reading operations, and basic statistical summaries (e.g., Proc Means), however, will take advantage of multiple processors (multithreading).

Multithreading in PASW (SPSS), Availability for 64-Bit Platforms

According to the most recent documentation, multithreading is not implemented in most predictive modeling algorithms, except Regression, CSLogistic, CSCoxReg, and Discriminant. Also, a few other methods, including Correlations and Factor, have been rewritten to take advantage of multiple processor cores. It appears (as of this writing) that the computation of basic aggregate statistics (means) and other data management operations are not multithreaded.

PASW Modeler (SPSS Clementine) not available for Windows 64-bit platform. At the time of this review, the data mining solution from IBM (PASW Modeler, formerly known as SPSS Clementine, which is a different application than the SPSS core software) is not available for 64-bit Windows platforms.
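To make concrete what multithreading a basic statistical summary involves (the kind of operation described above for Proc Means), the following is a minimal Python sketch, not the implementation used by any of the three packages. It assumes the data for one variable fits in memory; the function names `partial_sums` and `parallel_mean_sd` are hypothetical. Each worker summarizes one chunk, and the partial results are then combined into a mean and sample standard deviation.

```python
from concurrent.futures import ThreadPoolExecutor
import math
from typing import Sequence, Tuple


def partial_sums(chunk: Sequence[float]) -> Tuple[int, float, float]:
    """Return (count, sum, sum of squares) for one chunk of a variable."""
    return len(chunk), sum(chunk), sum(v * v for v in chunk)


def parallel_mean_sd(values: Sequence[float], workers: int = 4) -> Tuple[float, float]:
    """Summarize chunks concurrently, then combine the partial results.

    Requires at least two values (the sample variance divides by n - 1).
    """
    size = math.ceil(len(values) / workers)
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(partial_sums, chunks))
    n = sum(p[0] for p in parts)
    total = sum(p[1] for p in parts)
    total_sq = sum(p[2] for p in parts)
    mean = total / n
    # Sample variance from the combined sums; clamp tiny negative rounding error.
    variance = max((total_sq - n * mean * mean) / (n - 1), 0.0)
    return mean, math.sqrt(variance)


if __name__ == "__main__":
    print(parallel_mean_sd([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]))
```

In CPython, threads running pure-Python arithmetic are limited by the global interpreter lock, so this sketch shows only the split-and-combine structure rather than a real speedup. The sum-of-squares formula is also numerically less stable than a two-pass or Welford-style update, which a production implementation would likely prefer.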
Choosing Algorithms and Use Cases to Compare

The SAS and IBM/SPSS software differ in which specific data management and computational algorithms have been upgraded to take advantage of modern multi-core hardware. In addition, as mentioned above in the Abstract, there are methodological concerns regarding how to achieve the greatest comparability of test cases, given the diverse options offered in the three applications. For these reasons, it was decided to focus in this review only on the most common statistics, data management, and data aggregation and analysis methods.

For example, it is currently not possible to evaluate the effectiveness of many of the STATISTICA Data Miner core algorithms (e.g., of Boosted Trees to process multi-gigabyte input files) because these capabilities do not exist on other platforms. Also, specific features and implementation details of, for example, PASW (SPSS) CSLOGISTIC or CSSELECT (for probability sampling) or SAS Proc GenMod are sufficiently different from the respective implementations in STATISTICA Version 9 that it is difficult to compare "speed of computations."

For example, fitting a Poisson regression model with predictor designs to large datasets in SAS can (a) be a bit faster, (b) take about the same amount of time, or (c) take substantially longer than in STATISTICA Generalized Linear Models, depending on the parameterization of the model and other options that are chosen. In PASW (SPSS) this analysis must be performed via the GENLIN procedure, which currently (as of this writing) is not multithreaded, so the comparison would be somewhat "unfair."

Data File Sizes, Buffering Mechanisms (Best Achieved Performance), and Repeatability

Various comparisons were performed using data sets of various sizes and dimensions (rows, columns). Preliminary tests confirmed that the speed differences between platforms were mostly a function of the data file size, and not of the numbers of rows or columns in the data file.

Further, it is common (in STATISTICA and other platforms) that a substantial amount of processing time is spent simply reading the input data into the respective data analysis application (e.g., simply copying a file of several gigabytes takes time that depends mostly on the performance of the hard disk). To minimize the effect of hard disk access times, which could mask real performance differences in the speed with which different computing platforms perform basic data management and analysis operations, multiple runs were performed in each case.

The Windows Vista 64 operating system on the testing computer (with 8 Gigabytes of RAM) buffered the data, so that consecutive runs of the same analyses became faster because the data were already held in memory for more efficient access. STATISTICA 9 and SAS both use efficient mechanisms to leverage the buffering available on Windows platforms to improve performance on consecutive runs with the same data. In practice, this is an important opportunity to improve the speed of basic data management operations for ETL applications with very large datasets, because many such operations do indeed access the same data multiple times in consecutive analysis and/or data transformation passes.
Therefore, in all cases, the reported times in this White Paper compare the "best achieved performance" in consecutive runs, to gauge the capabilities of the respective platforms for real-life data management and ETL operations. Note that the "worst achieved performance" differences were sometimes smaller, because the bottleneck of the operations (e.g., to compute means and other moment statistics) was the speed with which the system could access data from the hard disk, while the respective computational algorithms were not used to full capacity, thus blurring the differences between data analysis platforms.

Methods

Hardware

All tests were performed on a 2.2 GHz quad-core Dell computer, running Windows Vista Professional for 64-bit platforms. The system had 8 GB of RAM.

Software

The comparisons were performed using the most recent releases of the respective software at the time of this writing:

1. STATISTICA Version 9.1a, for 64-bit systems
2. SAS Version 9.2 for 64-bit systems
3. PASW (SPSS) Statistics Version 18 (18.0.0) for 64-bit systems
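The "best achieved performance" convention described in this paper (taking the fastest of several consecutive runs, so that operating-system buffering has taken effect) can be sketched as a small timing harness. This is an illustrative Python sketch only, not the procedure used to produce the reported times; the function names and the stand-in workload are hypothetical.

```python
import time
from typing import Callable, Dict, List


def time_consecutive_runs(task: Callable[[], object], runs: int = 5) -> List[float]:
    """Run `task` several times in a row; return each wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        task()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Report the first run, the best (fastest) run, and the worst (slowest) run."""
    return {
        "first": durations[0],
        "best": min(durations),
        "worst": max(durations),
    }


if __name__ == "__main__":
    # Stand-in workload (hypothetical); a real test would read and process a data file.
    print(summarize(time_consecutive_runs(lambda: sum(i * i for i in range(200_000)))))
```

Reporting the first run alongside the best run makes the buffering effect visible: when the workload is dominated by disk reads, the first run is typically the slowest.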

Data

All comparisons were performed with:

1. a dataset with 30 variables and 9,000,000 cases (approximately 2.2 Gigabytes), and
2. a dataset with 500 variables and 1,000,000 cases (approximately 4 Gigabytes).

The sizes given are those of the STATISTICA data files. These files were exported to text format and then imported into the respective comparison analysis platforms (SAS 64-bit, PASW [SPSS]). The file sizes in those other platforms were comparable with the size of the STATISTICA data file.

Data Management Operations, Data Analysis Operations

The following 4 key tests were performed on all three data analysis platforms:

1. Computing basic descriptive statistics (means, standard deviations) for all variables
2. Computing a correlation matrix using casewise (listwise) deletion of missing data, for all variables or a subset of variables
3. Sorting the file by two keys into descending order; Key 1 contains 50% 1's and 50% 2's, Key 2 contains uniform random numbers
4. Creating a subset of approximately 25% of the data using logical case selection conditions involving two variables

For the reasons described in the Overview section, and also because multithreading support (or even support for 64-bit platforms) is not uniformly implemented or available across the analysis platforms in this test, these basic tests were chosen to provide as accurate a picture as possible of the "real capabilities" with respect to basic data management and data analysis (aggregation) operations. However, it should be noted that the complete testing process involved many additional tests that yielded results generally comparable to those reported here (e.g., for logistic regression and for general and generalized linear models, when compared to SAS, which also offers a multithreaded implementation of these procedures).
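The four test operations listed above can be sketched on a small synthetic dataset. This is a minimal standard-library Python illustration of what each test computes, not the code that was benchmarked; the row layout, the 5% missing-data rate, the selection condition, and all function names are assumptions made for the example.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple

# Hypothetical row layout: (key1, key2, x, y); x and y may be missing (None).
Row = Tuple[int, float, Optional[float], Optional[float]]


def make_rows(n: int, seed: int = 0) -> List[Row]:
    """Synthetic data: key1 is 50% 1's and 50% 2's, key2 is uniform random."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        x = rng.gauss(0.0, 1.0)
        y = 0.5 * x + rng.gauss(0.0, 1.0)
        key2 = rng.random()
        x_obs = None if rng.random() < 0.05 else x  # about 5% missing
        y_obs = None if rng.random() < 0.05 else y
        rows.append((1 + i % 2, key2, x_obs, y_obs))
    return rows


def mean_sd(values: Sequence[float]) -> Tuple[float, float]:
    """Test 1: mean and sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    return mean, math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def listwise_correlation(rows: Sequence[Row]) -> float:
    """Test 2: Pearson r using only cases where both x and y are present."""
    pairs = [(r[2], r[3]) for r in rows if r[2] is not None and r[3] is not None]
    mx, sx = mean_sd([p[0] for p in pairs])
    my, sy = mean_sd([p[1] for p in pairs])
    cov = sum((a - mx) * (b - my) for a, b in pairs) / (len(pairs) - 1)
    return cov / (sx * sy)


def sort_two_keys_descending(rows: Sequence[Row]) -> List[Row]:
    """Test 3: sort by key1, then key2, both in descending order."""
    return sorted(rows, key=lambda r: (r[0], r[1]), reverse=True)


def subset(rows: Sequence[Row]) -> List[Row]:
    """Test 4: compound condition on two variables selecting roughly 25% of cases."""
    return [r for r in rows if r[0] == 1 and r[1] < 0.5]


if __name__ == "__main__":
    data = make_rows(2000)
    print(mean_sd([r[2] for r in data if r[2] is not None]))
    print(listwise_correlation(data))
    print(len(subset(data)) / len(data))
```

Listwise deletion drops a case from the whole correlation computation if any of the variables involved is missing, which is why the pairs are filtered before either mean is computed.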

Results

In the following table, all results are reported as the number of seconds required to finish the respective task. (The timing values themselves are not preserved in this transcription.)

Columns: Test/Task; Notes; STATISTICA ver. 9.1a 64 Bit; SAS ver. 9.2 64 Bit; PASW/SPSS ver. 18 64 Bit

Tests/Tasks:

1. Basic Descriptive Statistics [1] (30 variables, 9,000,000 cases)
2. Basic Descriptive Statistics (500 variables, 1,000,000 cases)
3. Correlation matrix, casewise (listwise) deletion of missing data [2], 30 variables, 9,000,000 cases
4. Correlation matrix, casewise (listwise) deletion of missing data, 200 variables, 1,000,000 cases
5. Sorting 9,000,000 cases, 30 variables; 2 keys descending
6. Subsetting 2.2 gig file; extracting approx. 25% of the data using a compound logical case selection condition

Notes column entries, in the order in which they appear:

- All tests performed after processing data at access to active data
- All tests performed after processing data at access to active data
- All tests performed after processing data at access to active data; verified that Threads=4 in PASW (SPSS)
- All tests performed after processing data at access to active data; verified that Threads=4 in PASW (SPSS)
- Includes making file ready for analysis, i.e., actually creating it.

Footnotes:

[1] In SAS, Proc Means will take advantage of multiple processors.
[2] In SPSS, Correlations will take advantage of multiple processors.
[3] Speed varied widely depending on the initial sort order of the input file, and whether the file is sorted into ascending or descending order.

Conclusion

With the exception of the task of sorting a large dataset (where STATISTICA's score is lower, most likely because its general sorting engine is optimized for more complex data sets), the performance of STATISTICA 9.1a significantly exceeds that of SAS and IBM PASW (SPSS) on all of the key data management and aggregation (summarization) tasks. In summary, STATISTICA Version 9.1a, when implemented on a 64-bit multi-core platform, offers significant performance gains over previous releases and over the major competing software platforms SAS 9.2 and IBM PASW (SPSS) 18.

To reiterate, this comparison did not include the various algorithms for predictive modeling (data mining) available in STATISTICA Data Miner, most of which are now also implemented via highly optimized, multithreaded algorithms that can take advantage of modern multi-core hardware (e.g., C&RT, Boosted Trees, Random Forests, Generalized Linear Models). Our tests with STATISTICA Data Miner to date have shown significant performance advantages. For example, a real-world data set from a complex manufacturing application with 320 predictors and 278,000 cases (over 700 MB) can be analyzed with the Classification and Regression Trees procedure in just 83 seconds after an initial pass through the data; using the Stochastic Gradient Boosting procedure (Boosted Trees) with this file (building hundreds of trees), the predictive modeling analysis completes in under 6 minutes.

The results reported here also do not reflect the substantial capabilities for parallel processing available on the STATISTICA server platform (WebSTATISTICA), which offers load balancing technology and takes advantage of not only multiple CPUs but also CPUs distributed across servers and clusters of computers.
However, internal testing at StatSoft has documented the capabilities of the WebSTATISTICA server platform, for example as the analytics server platform hosting the scoring engine for the STATISTICA Live Score solution for credit and insurance applications. When combined with the performance for data management and data analysis operations documented here, STATISTICA Live Score provides substantial capabilities even when implemented on a single multi-core server and flexibly scales to clusters of servers (WebSTATISTICA manages load balancing across multiple servers). For more information, please contact StatSoft or visit


More information

Drivers to support the growing business data demand for Performance Management solutions and BI Analytics

Drivers to support the growing business data demand for Performance Management solutions and BI Analytics Drivers to support the growing business data demand for Performance Management solutions and BI Analytics some facts about Jedox Facts about Jedox AG 2002: Founded in Freiburg, Germany Today: 2002 4 Offices

More information

Tekla Structures 18 Hardware Recommendation

Tekla Structures 18 Hardware Recommendation 1 (5) Tekla Structures 18 Hardware Recommendation Recommendations for Tekla Structures workstations Tekla Structures hardware recommendations are based on the setups that have been used in testing Tekla

More information

Advanced Big Data Analytics with R and Hadoop

Advanced Big Data Analytics with R and Hadoop REVOLUTION ANALYTICS WHITE PAPER Advanced Big Data Analytics with R and Hadoop 'Big Data' Analytics as a Competitive Advantage Big Analytics delivers competitive advantage in two ways compared to the traditional

More information

Building In-Database Predictive Scoring Model: Check Fraud Detection Case Study

Building In-Database Predictive Scoring Model: Check Fraud Detection Case Study Building In-Database Predictive Scoring Model: Check Fraud Detection Case Study Jay Zhou, Ph.D. Business Data Miners, LLC 978-726-3182 jzhou@businessdataminers.com Web Site: www.businessdataminers.com

More information

SIGMOD RWE Review Towards Proximity Pattern Mining in Large Graphs

SIGMOD RWE Review Towards Proximity Pattern Mining in Large Graphs SIGMOD RWE Review Towards Proximity Pattern Mining in Large Graphs Fabian Hueske, TU Berlin June 26, 21 1 Review This document is a review report on the paper Towards Proximity Pattern Mining in Large

More information

Easily Identify Your Best Customers

Easily Identify Your Best Customers IBM SPSS Statistics Easily Identify Your Best Customers Use IBM SPSS predictive analytics software to gain insight from your customer database Contents: 1 Introduction 2 Exploring customer data Where do

More information

Major upgrade versions. To see which features each version of Windows 7 has, go to Microsoft's Compare Windows page.

Major upgrade versions. To see which features each version of Windows 7 has, go to Microsoft's Compare Windows page. Windows 7 Upgrading to Windows 7 Introduction Page 1 Now that you have explored what Windows 7 has to offer, we can help you understand what's involved in moving to the new operating system. In this lesson,

More information

IPRO ecapture Performance Report using BlueArc Titan Network Storage System

IPRO ecapture Performance Report using BlueArc Titan Network Storage System IPRO ecapture Performance Report using BlueArc Titan Network Storage System Allen Yen, BlueArc Corp Jesse Abrahams, IPRO Tech, Inc Introduction IPRO ecapture is an e-discovery application designed to handle

More information

Leveraging Ensemble Models in SAS Enterprise Miner

Leveraging Ensemble Models in SAS Enterprise Miner ABSTRACT Paper SAS133-2014 Leveraging Ensemble Models in SAS Enterprise Miner Miguel Maldonado, Jared Dean, Wendy Czika, and Susan Haller SAS Institute Inc. Ensemble models combine two or more models to

More information

Oracle Primavera P6 Enterprise Project Portfolio Management Performance and Sizing Guide. An Oracle White Paper October 2010

Oracle Primavera P6 Enterprise Project Portfolio Management Performance and Sizing Guide. An Oracle White Paper October 2010 Oracle Primavera P6 Enterprise Project Portfolio Management Performance and Sizing Guide An Oracle White Paper October 2010 Disclaimer The following is intended to outline our general product direction.

More information

Data Mining in the Swamp

Data Mining in the Swamp WHITE PAPER Page 1 of 8 Data Mining in the Swamp Taming Unruly Data with Cloud Computing By John Brothers Business Intelligence is all about making better decisions from the data you have. However, all

More information

Planning the Installation and Installing SQL Server

Planning the Installation and Installing SQL Server Chapter 2 Planning the Installation and Installing SQL Server In This Chapter c SQL Server Editions c Planning Phase c Installing SQL Server 22 Microsoft SQL Server 2012: A Beginner s Guide This chapter

More information

hmetrix Revolutionizing Healthcare Analytics with Vertica & Tableau

hmetrix Revolutionizing Healthcare Analytics with Vertica & Tableau Powered by Vertica Solution Series in conjunction with: hmetrix Revolutionizing Healthcare Analytics with Vertica & Tableau The cost of healthcare in the US continues to escalate. Consumers, employers,

More information

Data and Machine Architecture for the Data Science Lab Workflow Development, Testing, and Production for Model Training, Evaluation, and Deployment

Data and Machine Architecture for the Data Science Lab Workflow Development, Testing, and Production for Model Training, Evaluation, and Deployment Data and Machine Architecture for the Data Science Lab Workflow Development, Testing, and Production for Model Training, Evaluation, and Deployment Rosaria Silipo Marco A. Zimmer Rosaria.Silipo@knime.com

More information

IBM SPSS Data Preparation 22

IBM SPSS Data Preparation 22 IBM SPSS Data Preparation 22 Note Before using this information and the product it supports, read the information in Notices on page 33. Product Information This edition applies to version 22, release

More information

System Requirements Table of contents

System Requirements Table of contents Table of contents 1 Introduction... 2 2 Knoa Agent... 2 2.1 System Requirements...2 2.2 Environment Requirements...4 3 Knoa Server Architecture...4 3.1 Knoa Server Components... 4 3.2 Server Hardware Setup...5

More information

What s New in SPSS Statistics 17.0

What s New in SPSS Statistics 17.0 SPSS Statistics 17.0 New capabilities What s New in SPSS Statistics 17.0 Recognizing the increasingly critical role of analytics in helping organizations reach their goals, SPSS Inc. has made significant

More information

Vocera Voice 4.3 and 4.4 Server Sizing Matrix

Vocera Voice 4.3 and 4.4 Server Sizing Matrix Vocera Voice 4.3 and 4.4 Server Sizing Matrix Vocera Server Recommended Configuration Guidelines Maximum Simultaneous Users 450 5,000 Sites Single Site or Multiple Sites Requires Multiple Sites Entities

More information

Benchmarking Hadoop & HBase on Violin

Benchmarking Hadoop & HBase on Violin Technical White Paper Report Technical Report Benchmarking Hadoop & HBase on Violin Harnessing Big Data Analytics at the Speed of Memory Version 1.0 Abstract The purpose of benchmarking is to show advantages

More information

Cisco IP Communicator (Softphone) Compatibility

Cisco IP Communicator (Softphone) Compatibility Cisco IP Communicator (Softphone) Compatibility Cisco IP Communicator is Windows based and works on both XP and Vista The minimum PC requirements for use with Microsoft Windows XP are: Microsoft Windows

More information

TRENDS IN THE DEVELOPMENT OF BUSINESS INTELLIGENCE SYSTEMS

TRENDS IN THE DEVELOPMENT OF BUSINESS INTELLIGENCE SYSTEMS 9 8 TRENDS IN THE DEVELOPMENT OF BUSINESS INTELLIGENCE SYSTEMS Assist. Prof. Latinka Todoranova Econ Lit C 810 Information technology is a highly dynamic field of research. As part of it, business intelligence

More information

First, we ll look at some basics all too often the things you cannot change easily!

First, we ll look at some basics all too often the things you cannot change easily! Basic Performance Tips Purpose This document is inted to be a living document, updated often, with thoughts, tips and tricks related to getting maximum performance when using Tableau Desktop. The reader

More information

Hadoop & SAS Data Loader for Hadoop

Hadoop & SAS Data Loader for Hadoop Turning Data into Value Hadoop & SAS Data Loader for Hadoop Sebastiaan Schaap Frederik Vandenberghe Agenda What s Hadoop SAS Data management: Traditional In-Database In-Memory The Hadoop analytics lifecycle

More information

Advanced In-Database Analytics

Advanced In-Database Analytics Advanced In-Database Analytics Tallinn, Sept. 25th, 2012 Mikko-Pekka Bertling, BDM Greenplum EMEA 1 That sounds complicated? 2 Who can tell me how best to solve this 3 What are the main mathematical functions??

More information

WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat

WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat Information Builders enables agile information solutions with business intelligence (BI) and integration technologies. WebFOCUS the most widely utilized business intelligence platform connects to any enterprise

More information

Part V Applications. What is cloud computing? SaaS has been around for awhile. Cloud Computing: General concepts

Part V Applications. What is cloud computing? SaaS has been around for awhile. Cloud Computing: General concepts Part V Applications Cloud Computing: General concepts Copyright K.Goseva 2010 CS 736 Software Performance Engineering Slide 1 What is cloud computing? SaaS: Software as a Service Cloud: Datacenters hardware

More information

STATISTICA Solutions for Financial Risk Management Management and Validated Compliance Solutions for the Banking Industry (Basel II)

STATISTICA Solutions for Financial Risk Management Management and Validated Compliance Solutions for the Banking Industry (Basel II) STATISTICA Solutions for Financial Risk Management Management and Validated Compliance Solutions for the Banking Industry (Basel II) With the New Basel Capital Accord of 2001 (BASEL II) the banking industry

More information

Fast Analytics on Big Data with H20

Fast Analytics on Big Data with H20 Fast Analytics on Big Data with H20 0xdata.com, h2o.ai Tomas Nykodym, Petr Maj Team About H2O and 0xdata H2O is a platform for distributed in memory predictive analytics and machine learning Pure Java,

More information

In-Memory Data Management for Enterprise Applications

In-Memory Data Management for Enterprise Applications In-Memory Data Management for Enterprise Applications Jens Krueger Senior Researcher and Chair Representative Research Group of Prof. Hasso Plattner Hasso Plattner Institute for Software Engineering University

More information

EMC Unified Storage for Microsoft SQL Server 2008

EMC Unified Storage for Microsoft SQL Server 2008 EMC Unified Storage for Microsoft SQL Server 2008 Enabled by EMC CLARiiON and EMC FAST Cache Reference Copyright 2010 EMC Corporation. All rights reserved. Published October, 2010 EMC believes the information

More information

IBM SPSS Direct Marketing 19

IBM SPSS Direct Marketing 19 IBM SPSS Direct Marketing 19 Note: Before using this information and the product it supports, read the general information under Notices on p. 105. This document contains proprietary information of SPSS

More information

Large-Scale Test Mining

Large-Scale Test Mining Large-Scale Test Mining SIAM Conference on Data Mining Text Mining 2010 Alan Ratner Northrop Grumman Information Systems NORTHROP GRUMMAN PRIVATE / PROPRIETARY LEVEL I Aim Identify topic and language/script/coding

More information

Change Manager 5.0 Installation Guide

Change Manager 5.0 Installation Guide Change Manager 5.0 Installation Guide Copyright 1994-2008 Embarcadero Technologies, Inc. Embarcadero Technologies, Inc. 100 California Street, 12th Floor San Francisco, CA 94111 U.S.A. All rights reserved.

More information

Scalable Machine Learning - or what to do with all that Big Data infrastructure

Scalable Machine Learning - or what to do with all that Big Data infrastructure - or what to do with all that Big Data infrastructure TU Berlin blog.mikiobraun.de Strata+Hadoop World London, 2015 1 Complex Data Analysis at Scale Click-through prediction Personalized Spam Detection

More information

Autodesk Inventor on the Macintosh

Autodesk Inventor on the Macintosh Autodesk Inventor on the Macintosh FREQUENTLY ASKED QUESTIONS 1. Can I install Autodesk Inventor on a Mac? 2. What is Boot Camp? 3. What is Parallels? 4. How does Boot Camp differ from Virtualization?

More information

Promises and Pitfalls of Big-Data-Predictive Analytics: Best Practices and Trends

Promises and Pitfalls of Big-Data-Predictive Analytics: Best Practices and Trends Promises and Pitfalls of Big-Data-Predictive Analytics: Best Practices and Trends Spring 2015 Thomas Hill, Ph.D. VP Analytic Solutions Dell Statistica Overview and Agenda Dell Software overview Dell in

More information

Dragon Medical Enterprise Network Edition Technical Note: Requirements for DMENE Networks with virtual servers

Dragon Medical Enterprise Network Edition Technical Note: Requirements for DMENE Networks with virtual servers Dragon Medical Enterprise Network Edition Technical Note: Requirements for DMENE Networks with virtual servers This section includes system requirements for DMENE Network configurations that utilize virtual

More information

QLIKVIEW SERVER MEMORY MANAGEMENT AND CPU UTILIZATION

QLIKVIEW SERVER MEMORY MANAGEMENT AND CPU UTILIZATION QLIKVIEW SERVER MEMORY MANAGEMENT AND CPU UTILIZATION QlikView Scalability Center Technical Brief Series September 2012 qlikview.com Introduction This technical brief provides a discussion at a fundamental

More information

Make Better Decisions Through Predictive Intelligence

Make Better Decisions Through Predictive Intelligence IBM SPSS Modeler Professional Make Better Decisions Through Predictive Intelligence Highlights Easily access, prepare and model structured data with this intuitive, visual data mining workbench Rapidly

More information

Dell Statistica 13.0. September 2015

Dell Statistica 13.0. September 2015 Dell Statistica 13.0 September 2015 These release notes provide information about the Dell Statistica 13.0 release. Topics: About this release New features o All Statistica Products o Statistica Advanced

More information

testo dello schema Secondo livello Terzo livello Quarto livello Quinto livello

testo dello schema Secondo livello Terzo livello Quarto livello Quinto livello Extracting Knowledge from Biomedical Data through Logic Learning Machines and Rulex Marco Muselli Institute of Electronics, Computer and Telecommunication Engineering National Research Council of Italy,

More information

DELL. Virtual Desktop Infrastructure Study END-TO-END COMPUTING. Dell Enterprise Solutions Engineering

DELL. Virtual Desktop Infrastructure Study END-TO-END COMPUTING. Dell Enterprise Solutions Engineering DELL Virtual Desktop Infrastructure Study END-TO-END COMPUTING Dell Enterprise Solutions Engineering 1 THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL

More information

Hexaware E-book on Predictive Analytics

Hexaware E-book on Predictive Analytics Hexaware E-book on Predictive Analytics Business Intelligence & Analytics Actionable Intelligence Enabled Published on : Feb 7, 2012 Hexaware E-book on Predictive Analytics What is Data mining? Data mining,

More information

Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets

Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets http://info.salford-systems.com/jsm-2015-ctw August 2015 Salford Systems Course Outline Demonstration of two classification

More information

BIG DATA MARKETING: THE NEXUS OF MARKETING, ANALYSTS, AND IT

BIG DATA MARKETING: THE NEXUS OF MARKETING, ANALYSTS, AND IT BIG DATA MARKETING: THE NEXUS OF MARKETING, ANALYSTS, AND IT The term Big Data is definitely a leading contender for the marketing buzz-phrase of 2012. On November 11, 2011, a Google search on the phrase

More information

Performance Tuning Guidelines for PowerExchange for Microsoft Dynamics CRM

Performance Tuning Guidelines for PowerExchange for Microsoft Dynamics CRM Performance Tuning Guidelines for PowerExchange for Microsoft Dynamics CRM 1993-2016 Informatica LLC. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying,

More information

Big data big talk or big results?

Big data big talk or big results? Whitepaper 28.8.2013 1 / 6 Big data big talk or big results? Authors: Michael Falck COO Marko Nikula Chief Architect marko.nikula@relexsolutions.com Businesses, business analysts and commentators have

More information