IBM SPSS Statistics Server

Understanding the Benefits of IBM SPSS Statistics Server

Contents:
Introduction
Performance 101: Understanding the drivers of better performance
Why performance is faster with Statistics Server
Comparing performance between the Statistics Server and the Statistics client
Increase analyst productivity
Automating jobs with Statistics Server
Scoring new data with Statistics Server
Guidelines for purchasing Statistics Server
Conclusion
Appendix A: Description of local and distributed mode
Appendix B: Benchmark test details
Appendix C: Benchmark test results
About SPSS, an IBM Company

Introduction

IBM SPSS Statistics Server is robust, powerful analytical software that scales from handling the analytical needs of a single department to hundreds of users across the enterprise. It provides all of the features of IBM SPSS Statistics, plus capabilities that deliver faster performance, more efficient processing of large datasets and enhanced security in enterprise deployments.

Statistics Server's client/server architecture, its ability to take advantage of multiple processors and cores, and its advanced analytical procedures specially tuned to work with large datasets enable organizations with massive amounts of data to optimize performance on data transformations, reporting and analytics, whether data resides in a central data center or across distributed offices.

In benchmark testing designed to simulate a typical production environment, we found that most analytical procedures run faster on the Statistics Server than on the Statistics client [1], including:

- Data transformation procedures (add files, aggregates, match files, etc.): on average, 6 times faster on the Statistics Server
- Sort procedure: on average, 3.35 times faster on the Statistics Server
- Commonly used model-building procedures (regression, GLM, Mixed, nomreg, etc.): on average, 3 times faster on the Statistics Server

This report discusses the high-performance capabilities available with Statistics Server, provides detailed benchmarking results and addresses other important benefits such as job automation, scheduling and scoring data.

[1] The results described here are based on testing done in IBM SPSS laboratories. Although our test environments simulate typical production environments in the field, we cannot guarantee that organizations performing similar tests will see identical results. This data is presented for general guidance. Actual results will vary depending on the configuration of the Statistics Server and clients (number of CPU cores, RAM, disk speed, etc.). For more details on the benchmarking performed, see Appendix B.
Performance 101: Understanding the drivers of better performance

A number of parameters can affect the performance of an analytical procedure, including the number of central processing units (CPUs) or cores, the amount of random access memory (RAM), the speed and configuration of the disk drives, and the location of the data being analyzed.

Number of CPUs/cores
Ideally, analytical procedures should run twice as fast on two CPUs, three times as fast on three CPUs, and so on. However, such perfect scalability is rarely achieved in reality, and the performance benefits of multiple CPUs/cores vary from procedure to procedure, as explained below.

Degree of parallelization
This is the extent to which a procedure can be parallelized, or broken into multiple independent tasks. Procedures that can be easily parallelized and scheduled to run simultaneously on different CPUs/cores benefit the most. Procedures that are inherently serial or require a lot of disk I/O (for example, crosstabs and frequencies) will not benefit to a great extent from multiple CPUs/cores.

Parallelization overhead
This is the overhead associated with breaking up a procedure into independent tasks, scheduling each task and then merging the results. Because operating systems and hardware platforms differ in the way tasks are partitioned and distributed across CPUs/cores, it is reasonable to expect the parallelization overhead to vary between platforms.

Memory
Memory, in the context of this paper, refers to the amount of physical RAM on the machine. For faster performance, it is best to hold the entire dataset that an analytical procedure works on in RAM, because accessing data from RAM is much faster than accessing data from a disk. If the dataset cannot be held in its entirety in RAM, there is a cost associated with swapping parts of the dataset between RAM and disk.
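The scalability points above (number of CPUs/cores, degree of parallelization and parallelization overhead) can be illustrated with Amdahl's law. This is a standard textbook model that the paper itself does not cite, so the sketch below is an illustration under stated assumptions rather than a description of how Statistics Server schedules work. The function name and the `overhead` parameter are hypothetical.

```python
def estimated_speedup(parallel_fraction: float, cores: int, overhead: float = 0.0) -> float:
    """Estimate speedup over a single core using Amdahl's law.

    parallel_fraction: share of the single-core run time that can be parallelized (0..1).
    cores: number of CPUs/cores working on the parallel part.
    overhead: extra time for splitting, scheduling and merging, expressed as a
              fraction of the single-core run time (hypothetical parameter).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is not shortened; only the parallel part is divided across cores.
    return 1.0 / (serial_fraction + parallel_fraction / cores + overhead)


if __name__ == "__main__":
    for fraction in (1.0, 0.9, 0.5):
        for cores in (2, 4, 8):
            print(f"parallel={fraction:.0%} cores={cores}: "
                  f"{estimated_speedup(fraction, cores):.2f}x")
```

Under this model a fully parallel procedure on four cores reaches the ideal 4x, while a procedure that is only half parallelizable reaches 1.6x on the same hardware, which is consistent with the observation that mostly serial procedures gain little from extra cores.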
Disk drives/computer storage devices
Although there are several storage device technologies and configurations, high-end hard drives spin at 10,000 to 15,000 rpm and can achieve sustained transfer rates of up to 125 MB/sec. High-speed storage devices can dramatically improve performance when doing data transformations such as sorts, merges and aggregates on large datasets.

Accessing files over a LAN vs. WAN
Simply stated, a local area network (LAN) is the network technology used within an office to access datasets. A wide area network (WAN) is the network technology used across offices to access datasets. Although the speed of the LAN and WAN will vary depending on the type of technology and the configuration, accessing files over a LAN is anywhere from 20 to 40 times faster than accessing files over a WAN. An analytical procedure therefore runs much faster when the dataset is accessed over a LAN than when it is accessed over a WAN.

Why performance is faster with Statistics Server

No need to transfer datasets between distributed offices
The Statistics Server, when configured with the Statistics client in distributed mode (see Appendix A for a description of distributed mode), supports a client/server architecture. In this configuration, the Statistics Server is installed in the central data center, in close proximity to the data. Users across the enterprise (in central and distributed offices) use the Statistics client to connect to the Statistics Server. All of the analytical processing and data access takes place on the Statistics Server; only the results of the analysis are transferred over the network to the Statistics client. This makes the Statistics Server an ideal solution for users in remote offices or users who travel frequently and require access to analytical capabilities on the go.

Because the need to transfer large datasets to end users' desktops is eliminated, the data transferred over the network is minimized and performance is improved. This prevents bandwidth saturation and improves the performance of not only the Statistics application but also other mission-critical network applications, including enterprise resource planning (ERP) and customer relationship management (CRM).

We recommend Statistics Server for organizations with distributed offices that need to access files greater than 25 MB across offices.
Time to access a data file:

| File size | Statistics client connecting directly to the data over a WAN (T1 3.0 Mbps) | Statistics client connecting to the Statistics Server at the data center over a WAN (T1 3.0 Mbps) | Time saved with Statistics Server |
|-----------|---------------------|---------|---------------------|
| 50 MB     | 2 min, 10 secs      | 4 secs  | 2 min, 6 secs       |
| 250 MB    | 10 min, 50 secs     | 40 secs | 10 min, 10 secs     |
| 1 GB      | 43 min, 17 secs     | 80 secs | 41 min, 57 secs     |

Table 1. Comparing time to access data using the Statistics client in local mode (accessing files in the data center directly over the WAN) vs. accessing the same data using the Statistics client to connect to the Statistics Server over the WAN, with data access handled by the Statistics Server [2]

[2] The results are based on the assumption that the available bandwidth is 3.0 Mbps. In reality, the time saved will be greater because bandwidth is also taken up by other applications, such as network backups. The data presented here is for illustrative purposes only. Actual results will vary depending on the configuration, bandwidth and latency of the WAN; therefore, organizations performing similar tests may not see identical results.
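The direct-over-WAN timings in Table 1 are roughly what a simple size-divided-by-bandwidth estimate gives. The sketch below is an approximation under stated assumptions (1 MB taken as 10^6 bytes, 1 Mbps as 10^6 bits per second, no latency or protocol overhead); the paper does not say how its figures were derived, and all function names are hypothetical. The `TABLE_1_SECONDS` values are the Table 1 timings converted to seconds.

```python
def ideal_transfer_seconds(size_mb: float, bandwidth_mbps: float) -> float:
    """Ideal time to move size_mb megabytes over a link of bandwidth_mbps megabits/s."""
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth_mbps must be positive")
    return size_mb * 8 / bandwidth_mbps  # 8 bits per byte


def time_saved(direct_seconds: int, via_server_seconds: int) -> int:
    """Seconds saved by letting the server access the data."""
    return direct_seconds - via_server_seconds


def format_min_sec(seconds: float) -> str:
    """Render a duration in the 'M min, S secs' style used in Table 1."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes} min, {secs} secs"


# Table 1 timings in seconds: (client direct over WAN, client via Statistics Server)
TABLE_1_SECONDS = {"50 MB": (130, 4), "250 MB": (650, 40), "1 GB": (2597, 80)}

if __name__ == "__main__":
    for label, (direct, via_server) in TABLE_1_SECONDS.items():
        print(label, "saved:", format_min_sec(time_saved(direct, via_server)))
    print("ideal 50 MB at 3.0 Mbps:", format_min_sec(ideal_transfer_seconds(50, 3.0)))
```

For the 50 MB row the ideal estimate is about 133 seconds, close to the 2 min, 10 secs reported in the table; the "time saved" column is simply the difference of the two timing columns.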
As shown in Table 1, significant time savings can be achieved with Statistics Server when accessing files in distributed offices: for example, about 2 minutes for a 50 MB file, 10 minutes for a 250 MB file and 42 minutes for a 1 GB file.

Multithreading
Multithreading is the technique of breaking a task into multiple tasks that can be executed in parallel. As discussed above, not all analytical procedures can take advantage of multithreading. The procedures that are multithreaded in Statistics are listed in Table 2 below. In Statistics Server, there is no limit to the number of threads supported per procedure. The number of threads can be configured automatically for a user or group, or can be set manually. Users can also set the number of threads on a per-procedure basis.

Procedure family: Correlations; Regression; Data Reduction; Survival Analysis; Multiple Imputation
Procedure name: Bivariate; Partial; Linear; Ordinal; Multinomial Logistic; Factor Analysis; Cox Regression; Logistic Regression; Impute missing values

Table 2: List of multithreaded analytical procedures

As shown in Appendix C, the benefits of multithreading become more pronounced as the number of variables [3] increases (wide datasets). The results of the benchmark testing show that the performance of the following commonly used analytical procedures improved significantly as the number of threads increased from 4 to 16 [4]:

- Linear regression procedure: improved by 52 percent
- Factor procedure: improved by 43 percent
- Cox regression procedure: improved by 24 percent
- Correlation procedure: improved by 24 percent

Additional details on the benchmark tests that demonstrate the benefits of multithreading can be found in Appendix C.

[3] The term variables refers to the number of columns or predictors in your dataset.

[4] The results shown are based on testing done in the laboratories of SPSS, an IBM Company. Although our test environments simulate typical production environments in the field, we cannot guarantee that organizations performing similar tests will see identical results. This data is presented for general guidance. Actual results will vary depending on the configuration of the Statistics Server and clients (number of CPU cores, RAM, disk speed, etc.).
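To make the idea of breaking a procedure into tasks that run in parallel and are then merged more concrete, here is a minimal sketch for a bivariate (Pearson) correlation. It is not the Statistics implementation, which this paper does not describe; the function names and the chunking scheme are assumptions chosen for illustration. Each worker computes partial sums over its chunk of cases, and the partial sums are merged before the final formula is applied.

```python
from concurrent.futures import ThreadPoolExecutor
from math import sqrt


def partial_sums(chunk):
    """Return the sums needed for a correlation over one chunk of (x, y) pairs."""
    n = len(chunk)
    sx = sum(x for x, _ in chunk)
    sy = sum(y for _, y in chunk)
    sxx = sum(x * x for x, _ in chunk)
    syy = sum(y * y for _, y in chunk)
    sxy = sum(x * y for x, y in chunk)
    return n, sx, sy, sxx, syy, sxy


def parallel_correlation(pairs, workers=4):
    """Pearson correlation of a non-empty list of (x, y) pairs, computed chunk by chunk."""
    size = max(1, -(-len(pairs) // workers))  # ceiling division: cases per chunk
    chunks = [pairs[i:i + size] for i in range(0, len(pairs), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(partial_sums, chunks))  # independent tasks
    # Merge step: add up the per-chunk sums column by column.
    n, sx, sy, sxx, syy, sxy = (sum(column) for column in zip(*parts))
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx * sx) * (n * syy - sy * sy))


if __name__ == "__main__":
    data = [(i, 2 * i + 1) for i in range(100)]  # perfectly linear toy data
    print(parallel_correlation(data, workers=4))
```

The split, schedule and merge steps are where the parallelization overhead discussed earlier comes from. In standard Python, threads do not speed up CPU-bound pure-Python code, so this sketch shows the structure of the technique rather than a measurable speedup.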
Support for 64-bit computing
The total amount of RAM supported depends on the processor. Theoretically, 32-bit processors are limited to accessing 4 GB of RAM. Typically, the RAM available to an application on a 32-bit machine is much lower, for several reasons:

- Most machines with 32-bit processors are not configured with 4 GB of RAM because RAM is expensive
- The operating system requires some RAM as well

Hence, even on machines with 32-bit processors configured with the maximum amount of RAM, the RAM available to the application is approximately 2 to 3 GB. On machines with 64-bit processors, the amount of RAM supported is several multiples higher. Analytical procedures that run on large datasets will run much more slowly on a 32-bit machine than on a 64-bit machine because of the disk activity required to swap parts of the dataset into and out of RAM.

SQL pushback
The Statistics Server supports the pushback of sorts and aggregates to a SQL database. When large datasets are sourced from a SQL database, SQL pushback ensures that operations that can be performed more efficiently in the database are performed there.

Support for advanced analytical procedures tuned to work with large datasets with a lot of predictors
Statistics Server supports advanced procedures such as Naïve Bayes and the Predictor Selector algorithm that are specially designed for wide datasets with a large number of predictors. These analytical procedures are not available in the Statistics client when configured in local mode.

Support for server operating systems and hardware
The Statistics Server is designed to support server operating systems and hardware. Desktop operating systems, namely Windows XP and Vista, are limited to two processors or sockets [5]. Server operating systems in general support a greater number of processors or sockets. As discussed above, procedures that can be parallelized run much faster on an operating system that supports a greater number of sockets or processors. Additionally, server operating systems have several sophisticated features that improve performance, scalability and resilience.

Unlike the Statistics Base client, which is limited to a maximum of four CPUs or cores, an analytical procedure performed on the Statistics Server can access an unlimited number of CPUs and cores.

[5] Windows Vista and XP do not limit the number of cores per socket.
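The SQL pushback idea described above can be sketched with Python's built-in `sqlite3` module. This is only an illustration of the general principle (let the database sort and aggregate, and return just the summary rows); it does not show how Statistics Server generates its SQL, and the table name, columns and data are hypothetical.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a small in-memory table standing in for a large source table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    rows = [("east", 10.0), ("west", 5.0), ("east", 2.5), ("north", 7.0), ("west", 1.0)]
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def aggregate_on_client(conn: sqlite3.Connection):
    """No pushback: every row leaves the database and is summed by the caller."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        totals[region] = totals.get(region, 0.0) + amount
    return sorted(totals.items())


def aggregate_in_database(conn: sqlite3.Connection):
    """Pushback: the database aggregates and sorts; only summary rows are returned."""
    query = "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    connection = build_demo_db()
    print(aggregate_on_client(connection))
    print(aggregate_in_database(connection))
```

Both functions return the same totals, but the second transfers one row per region instead of one row per case, which is where the benefit comes from when the source table is large.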
Statistics Server is ideal for organizations with a single office that need to perform analysis on files that are greater than 100 MB.

Comparing performance between the Statistics Server and the Statistics client

Results of specific procedures [6] run on both the Statistics Server and the Statistics client demonstrate that:

- Data transformation procedures (add files, aggregates, match files, etc.) run on average 6 times faster on the Statistics Server
- The sort procedure runs on average 3.35 times faster on the Statistics Server
- Commonly used modeling procedures such as regression, GLM, Mixed and nomreg run on average 3 times faster on the Statistics Server

Rather than simply time several procedures independently, the benchmarking test was structured to simulate a typical job run in a production environment. Groups of related procedures were assembled into test suites. Each grouping was meant to reflect a certain type of analysis or data processing that a Statistics user might execute in the course of a day's work. Five test suites were developed, as listed below:

1. Data transformations: add files, aggregates, case to variables, sort, etc.
2. Simple multi-threaded procedures: correlation, factor, etc.
3. Building models: GLM, mixed, nomreg
4. Data mining: trees
5. Statistical calculations: beta, srange, smod, poisson, etc.

| Groups of related procedures | Time saved with Statistics Server | Average speedup with Statistics Server |
|------------------------------|-----------------------------------|----------------------------------------|
| Data transformations | 64.95% | 5.92 |
| Sort | 69.90% | 3.35 |
| Commonly used multi-threaded procedures (N=10M cases) | 47.52% | 2.31 |
| Building models | 62.19% | 2.90 |
| Data mining | 43.98% | 1.44 |
| Statistical calculations | 62.44% | 2.90 |
| AVERAGE | 60.60% | 2.54 |

Table 3: Benchmarking results for jobs run on Statistics Server and the Statistics client [7]

[6] The results shown are based on testing done in IBM SPSS laboratories. Although our test environments simulate typical production environments in the field, we cannot guarantee that organizations performing similar tests will see identical results. This data is presented for general guidance. Actual results will vary depending on the configuration of the Statistics Server and clients (number of CPU cores, RAM, disk speed, etc.).

[7] The results shown in Table 3 are subject to the same conditions as described in note 6.
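The two figures in the AVERAGE row of Table 3 (60.60% time saved and a 2.54 speedup) are two views of the same ratio of client time to server time. The sketch below shows the conversion; the function names are hypothetical, and the identity is ordinary arithmetic rather than something the paper states.

```python
def speedup_from_time_saved(time_saved: float) -> float:
    """Convert a fraction of time saved (0 <= value < 1) into a speedup factor.

    If the server needs (1 - time_saved) of the client's time, the client time
    divided by the server time is 1 / (1 - time_saved).
    """
    if not 0.0 <= time_saved < 1.0:
        raise ValueError("time_saved must be in the range [0, 1)")
    return 1.0 / (1.0 - time_saved)


def time_saved_from_speedup(speedup: float) -> float:
    """Convert a speedup factor (>= 1) back into the fraction of time saved."""
    if speedup < 1.0:
        raise ValueError("speedup must be at least 1")
    return 1.0 - 1.0 / speedup


if __name__ == "__main__":
    print(round(speedup_from_time_saved(0.606), 2))   # 2.54
    print(round(time_saved_from_speedup(2.54), 3))    # 0.606
```

The per-group rows of Table 3 do not all satisfy this identity. That would be expected if the group speedups are averages of per-procedure speedups while the time saved is computed on total run time, but the paper does not say how the group figures were calculated.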
The results in Table 3 show that, on average, the Statistics Server is 2.54 times faster than the Statistics client (on a procedure basis), and the time saved on a typical Statistics job is 60.6 percent.

| Description of capability | Statistics Server | Statistics client configured in local mode |
|---------------------------|-------------------|--------------------------------------------|
| Supports client/server architecture. Datasets don't have to be downloaded to a user's desktop. | Yes | No. All files need to be downloaded to the user's desktop. |
| Supports multiple processors and cores | No limit to the number of CPUs and cores supported. | Number of threads is limited to 4. This limits the number of CPUs and cores supported to 4. |
| Supports server operating system and hardware | Yes | No |

Table 4. The reasons why a job run on Statistics Server is faster than a job run on the Statistics client.

Table 4 compares the capabilities of Statistics Server with those of the Statistics client configured in local mode to illustrate why jobs can run significantly faster using the server software.

Additional information on the benchmarking tests, including the test suite procedures, dataset sizes and configuration of the Statistics Server and client, is provided in Appendix B.

Increase analyst productivity

Analysts can run multiple analytical jobs at the same time while continuing to work on their desktops.

Statistics Server's high-performance capabilities enable organizations to achieve significant gains in productivity. When users are connected to a Statistics Server in distributed mode, they can initiate multiple analytical jobs concurrently. This is an important advantage over the client software, particularly when performing data transformation jobs on large datasets. Because all of the processing is done on the Statistics Server, users can continue to work on their desktops while running several jobs at the same time.

Automating jobs with Statistics Server

The Statistics batch facility available with Statistics Server is ideal for jobs that are repetitive and need to be performed at regular intervals. Efficiencies are realized because the manual tasks associated with running weekly, monthly or quarterly reports are minimized.
Additionally, when Statistics Server is used with IBM SPSS Collaboration and Deployment Services, these jobs can be scheduled automatically, leveraging this platform's content management and scheduling capabilities. Run-time variables are supported, allowing the same job to be run multiple times with different input parameters. More importantly, the output of the job (the report, etc.) can be stored in the repository and accessed directly by business users through a dashboard. (A Web interface is available with Collaboration and Deployment Services.)

Scoring new data with Statistics Server

The Statistics Server ships with a scoring engine that allows new data to be scored. Users connected to Statistics Server in distributed mode can open one or more models created in Statistics, IBM SPSS Modeler or IBM SPSS AnswerTree, and score new data. This capability is not available with the Statistics client in local mode.

Guidelines for purchasing Statistics Server

The Statistics Server is especially designed for the following scenarios:

- Organizations with distributed offices looking to centralize their data and IT infrastructure in one or more data centers
- Organizations with distributed offices that need to analyze and share files greater than 25 MB across offices
- Organizations looking to virtualize applications and desktops using enabling technologies like Citrix/Terminal Server. These servers are tuned for presenting applications and user interfaces and are not designed to handle the CPU-intensive and I/O-intensive workload of analytic jobs. Statistics Server ensures that the heavy processing is offloaded from the Citrix/Terminal Server box, which improves performance and availability.
- Organizations that need to perform analysis on large datasets (greater than 100 MB) sourced from a SQL server or a data warehouse

Conclusion

Statistics Server is sophisticated analytical server software that provides robust, scalable analytical capabilities when working with large datasets. It supports a client/server architecture that enables organizations to pursue a centralization strategy. Because large datasets do not have to move across offices for analysis, performance improves, resulting in greater analyst productivity and efficiency in distributed offices.
In addition, because Statistics Server is a foundational technology, organizations that invest in it can leverage it in many ways. For example, Statistics Server, when integrated with Collaboration and Deployment Services, enables them to:

- Automate scheduling of Statistics jobs
- Store the output of a Statistics job in a portal where it can be accessed by business users
- Deploy simplified analytical capabilities targeted to business users via a Web interface for jobs executed on Statistics Server

When integrated with Modeler, Statistics Server enables organizations to:

- Take advantage of advanced data mining algorithms and a complementary, process-driven approach for building and scoring models
- Integrate advanced model management and deployment capabilities seamlessly with existing business processes
- Excel in today's fast-paced business environment by building and deploying many highly accurate models without requiring deep statistical expertise

Appendix A: Description of local and distributed mode

Local mode
When running in local mode, all the analysis is performed on the user's desktop computer using the CPU resources of the desktop itself. All of the data being analyzed needs to be transferred to the local user's desktop (see Figure 1). If users are performing transformations on data located in a shared network resource, the transformed data must be transferred back across the network to be saved on the file server or database. As the size of the data and the number of users increase, these data transfers can take up an appreciable amount of network bandwidth, adversely impacting network performance and the performance of other mission-critical applications, such as ERP and CRM, that run on the network. This makes local mode more suitable for organizations with single offices and relatively small datasets.

Figure 1. Statistics run in local mode.
Distributed mode
In distributed mode, all the analysis is performed on the Statistics Server, located at the central data center (typically co-located with the data files). Because the analysis is performed on the Statistics Server, there is no need to transfer data to individual users' desktops. Because all the data transfers are localized between the Statistics Server and the file server/database, performance is greatly improved. Only the results of the analysis (typically a fraction of the size of the original data) are transferred to the Statistics client.

Figure 2. Statistics in distributed mode.

Appendix B: Benchmark test details

Configuration
All the testing was done using the Statistics batch facility [8]. Datasets were local to the Statistics Server. It is reasonable to expect similar results when using a Statistics client to connect to the Statistics Server (distributed mode). Compared with running a job using the batch facility, running the same job in distributed mode carries a small overhead. This is because in distributed mode the results of the analysis are transferred across the network from the Statistics Server to the end user's machine, whereas with the batch facility the results are written to a disk drive or network share accessible to the Statistics Server. Because the output of the analysis is typically small, the overhead associated with transferring it over a properly configured network is minimal.

Repeated trials
To help control for the chance variation of any single test run, each test suite was repeated three times. The average time in seconds is reported.

[8] Typically the client for Statistics Server is the Statistics client running on a desktop computer. The Statistics Server batch facility is an alternative way to use the power of the Statistics Server. StatisticsB is a command-line executable that runs on the server computer where the Statistics Server is installed. StatisticsB is intended for automated production of statistical reports. Automated production provides the ability to run analyses without user intervention, which is advantageous if users are required to perform repetitive, time-consuming analyses such as weekly reports. StatisticsB takes as its input a syntax file containing the data transformation and/or analytical procedures to run, with several command-line arguments to control the format of the output or to customize it.
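Because StatisticsB takes a syntax file as its input, a repetitive report can be prepared by generating one syntax file per run from a template. The sketch below shows one way to do that with Python's standard `string.Template`. It is an assumption-laden illustration, not part of the product: the template, function name, data file names and variable names are hypothetical, and the sketch deliberately stops short of invoking StatisticsB because its command-line arguments are not documented in this paper.

```python
from string import Template

# Hypothetical syntax template: open a data file and run a frequencies report.
SYNTAX_TEMPLATE = Template(
    "GET FILE='$data_file'.\n"
    "FREQUENCIES VARIABLES=$variables.\n"
)


def render_job(data_file: str, variables: list) -> str:
    """Fill the template with one run's input parameters and return the syntax text."""
    return SYNTAX_TEMPLATE.substitute(
        data_file=data_file,
        variables=" ".join(variables),
    )


if __name__ == "__main__":
    # Hypothetical quarterly inputs; each rendered text would be saved as a
    # syntax file and handed to the batch facility by a scheduler.
    for quarter in ("q1", "q2", "q3", "q4"):
        print(render_job(f"sales_2010_{quarter}.sav", ["region", "product"]))
```

Keeping the procedure text in one template and varying only the inputs mirrors the run-time variable idea described for Collaboration and Deployment Services, where the same job is run several times with different parameters.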
Configuration of the Statistics Server
- CPU: 4 CPUs, Intel Xeon 3 GHz, dual core, Hyper-Threaded
- RAM: 8 GB
- Operating system: Windows 2003 Server, 64-bit

Configuration of the Statistics client
- CPU: 1 CPU, Intel T 7500, 2.19 GHz, dual core
- RAM: 3 GB
- Operating system: Windows XP, 32-bit

Details on the dataset
Two datasets were used:
- Dataset 1: Size 2.1 GB, 5 million cases, 127 variables
- Dataset 2: Size 3 GB, 10 million cases, 127 variables (used for simple multithreaded procedures; see Table 5 for details)

| Groups of related procedures | Statistics Server (64 bit)* | Statistics Client (32 bit)** | Time saved | Average speedup (multiple of times faster) |
|------------------------------|-----|-----|--------|------|
| Data transformations | | | | |
| ADD FILES | | | | 9.18 |
| AGGREGATE | | | | 2.86 |
| CASESTOVARS | | | | 0.90 |
| MATCH FILES | | | | |
| VARSTOCASES | | | | 3.88 |
| UNIFORM (Simple COMPUTE) | | | | 5.71 |
| Group average | | | 61.44% | 6.02 |
| Sort | | | | |
| SORT NUMERIC | | | | 3.94 |
| SORT STRING | | | | 2.87 |
| Group average | | | 69.90% | 3.35 |
| Simple multithreaded procedures (N=10M) | | | | |
| CORRELATION | | | | 3.47 |
| FACTOR | | | | 1.56 |
| PARTIAL CORR | | | | 1.54 |
| REGRESSION (120 dependent variables) | | | | 1.94 |
| Group average | | | 47.52% | 2.31 |
| Building models | | | | |
| GLM | | | | 5.01 |
| MIXED | | | | 1.50 |
| NOMREG | | | | 2.97 |
| REGRESSION | | | | 3.24 |
| Group average | | | 62.19% | 2.90 |
| Data mining | | | | |
| TREES | | | | 1.44 |
| Group average | | | 43.98% | 1.44 |
| Statistical calculations | | | | |
| BETA | | | | 2.65 |
| CFVAR & BETA | | | | 3.39 |
| POISSON BERNOULLI | | | | 2.20 |
| Group average | | | 62.44% | 2.90 |
| Total time | | | | 2.54 |

Table 5: Benchmarking data comparing Statistics Server with the Statistics client [9]

* Number of threads: 8
** Number of threads: 2

[9] The data shown is based on testing done in IBM SPSS laboratories. Although our test environments simulate typical production environments in the field, we cannot guarantee that organizations performing similar tests will see identical results. This data is presented for general guidance. Actual results will vary depending on the configuration of the Statistics Server and clients (number of CPU cores, RAM, disk speed, etc.).
Appendix C: Benchmark test results

| Multi-threaded procedure names | File size | Number of cases | Time saved, 4 to 8 threads | Time saved, 4 to 16 threads |
|--------------------------------|-----------|-----------------|--------|--------|
| Discriminant | 351MB | 200, | | 5.88% |
| Csscoxreg | | | | 23.76% |
| Sort | 2.7GB | 2,000, | | 13.54% |
| Csordinal | | | | -8.97% |
| Cslogistic | 48MB | 100, | | 18.48% |
| Linear regression | 703MB | 200, | | 52.34% |
| Factor | 703MB | 200, | | 43.30% |
| Correlation | | | | 24.35% |
| Partial correlation | | | | 29.80% |
| Nomreg | | | | 18.47% |
| Csselect | | | | 1.34% |
| Percentage time saved overall (total time) | | | 20.76% | 27.60% |

Table 6: Benchmarking results demonstrating performance improvements as the number of threads increases [10].

As the number of threads increases from 4 to 16:

- The linear regression procedure improves by 52 percent
- The factor procedure improves by 43 percent
- The Cox regression procedure improves by 24 percent
- The correlation procedure improves by 24 percent

Overall, performance for the multithreaded procedures also improves as the number of threads increases from 4 to 8.

[10] The data shown is based on testing done in IBM SPSS laboratories. Although our test environments simulate typical production environments in the field, we cannot guarantee that organizations performing similar tests will see identical results. This data is presented for general guidance. Actual results will vary depending on the configuration of the Statistics Server and clients (number of CPU cores, RAM, disk speed, etc.).
About SPSS, an IBM Company

SPSS, an IBM Company, is a leading global provider of predictive analytics software and solutions. The company's complete portfolio of products (data collection, statistics, modeling and deployment) captures people's attitudes and opinions, predicts outcomes of future customer interactions, and then acts on these insights by embedding analytics into business processes. IBM SPSS solutions address interconnected business objectives across an entire organization by focusing on the convergence of analytics, IT architecture and business process. Commercial, government and academic customers worldwide rely on IBM SPSS technology as a competitive advantage in attracting, retaining and growing customers, while reducing fraud and mitigating risk.

SPSS was acquired by IBM in October. For further information, or to reach a representative, visit the company's Web site.

Copyright IBM Corporation 2010

SPSS Inc., an IBM Company Headquarters, 233 S. Wacker Drive, 11th floor, Chicago, Illinois

SPSS is a registered trademark and the other SPSS products named are trademarks of SPSS Inc., an IBM Company. SPSS Inc., an IBM Company. All Rights Reserved. IBM and the IBM logo are trademarks of International Business Machines Corporation in the United States, other countries or both. Other company, product and service names may be trademarks or service marks of others.

References in this publication to IBM products or services do not imply that IBM intends to make them available in all countries in which IBM operates. Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product, and use of those Web sites is at your own risk.

Please Recycle

YTW03038USEN-00
High-Performance Business Analytics: SAS and IBM Netezza Data Warehouse Appliances Highlights IBM Netezza and SAS together provide appliances and analytic software solutions that help organizations improve
SYSTEM X SERVERS SOLUTION BRIEF Minimize cost and risk for data warehousing Microsoft Data Warehouse Fast Track for SQL Server 2014 on System x3850 X6 (55TB) Highlights Improve time to value for your data
Muse Server Sizing 18 June 2012 Document Version 0.0.1.9 Muse 126.96.36.199 Notice No part of this publication may be reproduced stored in a retrieval system, or transmitted, in any form or by any means, without
White Paper Dell Microsoft Business Intelligence and Data Warehousing Reference Configuration Performance Results Phase III Performance of Microsoft SQL Server 2008 BI and D/W Solutions on Dell PowerEdge
Kaseya Kaseya IT Automation Framework An Integrated solution designed for reducing complexity while increasing productivity for IT Professionals and Managed Service Providers. The powerful, web-based automation
Introduction 1 Performance on Hosted Server 1 Figure 1: Real World Performance 1 Benchmarks 2 System configuration used for benchmarks 2 Figure 2a: New tickets per minute on E5440 processors 3 Figure 2b:
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material,
Oracle Primavera Contract Management, Business Intelligence Publisher Edition-Sizing Guide An Oracle White Paper July 2011 1 Disclaimer The following is intended to outline our general product direction.
Deployment Planning Guide August 2011 Copyright: 2011, CCH, a Wolters Kluwer business. All rights reserved. Material in this publication may not be reproduced or transmitted in any form or by any means,
IBM SPSS Modeler Professional Make better decisions through predictive intelligence Highlights Create more effective strategies by evaluating trends and likely outcomes. Easily access, prepare and model
White Paper Informatica Ultra Messaging SMX Shared-Memory Transport Breaking the 100-Nanosecond Latency Barrier with Benchmark-Proven Performance This document contains Confidential, Proprietary and Trade
Achieving Nanosecond Latency Between Applications with IPC Shared Memory Messaging In some markets and scenarios where competitive advantage is all about speed, speed is measured in micro- and even nano-seconds.
System Requirements for Microsoft Dynamics GP 2015 This document contains the minimum client hardware requirements, server recommendations and Terminal Server minimum hardware requirements supported by
White Paper Recording Server Virtualization Prepared by: Mike Sherwood, Senior Solutions Engineer Milestone Systems 23 March 2011 Table of Contents Introduction... 3 Target audience and white paper purpose...
Technical White Paper Report Technical Report Benchmarking Hadoop & HBase on Violin Harnessing Big Data Analytics at the Speed of Memory Version 1.0 Abstract The purpose of benchmarking is to show advantages
Inmagic Content Server v1.3 Technical Guidelines 6/2005 Page 1 of 15 Inmagic Content Server Standard and Enterprise Configurations Technical Guidelines Last Updated: June, 2005 Inmagic, Inc. All rights
Gladstone Health & Leisure Technical Services Plus2 Environment Server Recommendations Commercial in Confidence Database Server Specifications Database server specifications are based on sizes in use on
WHITE PAPER July 2014 Achieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks Contents Executive Summary...2 Background...3 InfiniteGraph...3 High Performance
Inmagic Content Server Workgroup Configuration Technical Guidelines 6/2005 Page 1 of 12 Inmagic Content Server Workgroup Configuration Technical Guidelines Last Updated: June, 2005 Inmagic, Inc. All rights
SQL Server Consolidation Using Cisco Unified Computing System and Microsoft Hyper-V White Paper July 2011 Contents Executive Summary... 3 Introduction... 3 Audience and Scope... 4 Today s Challenges...
1 VMWARE WHITE PAPER Introduction This paper outlines the considerations that affect network throughput. The paper examines the applications deployed on top of a virtual infrastructure and discusses the
Sage 100 Standard ERP Version 2013 The information in this document applies to Sage 100 Standard ERP Version 2013 1. Detailed product update information and support policies can be found on the Sage Online
Performance and scalability of a large OLTP workload ii Performance and scalability of a large OLTP workload Contents Performance and scalability of a large OLTP workload with DB2 9 for System z on Linux..............
Virtualization Guide McAfee Vulnerability Manager Virtualization COPYRIGHT Copyright 2012 McAfee, Inc. Do not copy without permission. TRADEMARKS McAfee, the McAfee logo, McAfee Active Protection, McAfee
Description: This application note aims to assist you in choosing the right edition of Microsoft SQL server for your ICONICS applications. OS Requirement: XP Win 2000, XP Pro, Server 2003, Vista, Server
QLIKVIEW ARCHITECTURE AND SYSTEM RESOURCE USAGE QlikView Technical Brief April 2011 www.qlikview.com Introduction This technical brief covers an overview of the QlikView product components and architecture
Technical Report Comparing the Network Performance of Windows File Sharing Environments Dan Chilton, Srinivas Addanki, NetApp September 2010 TR-3869 EXECUTIVE SUMMARY This technical report presents the
IOmark- VDI HP HP ConvergedSystem 242- HC StoreVirtual Test Report: VDI- HC- 150427- b Test Copyright 2010-2014 Evaluator Group, Inc. All rights reserved. IOmark- VDI, IOmark- VM, VDI- IOmark, and IOmark
White Paper EMC XtremSF: Delivering Next Generation Performance for Oracle Database Abstract This white paper addresses the challenges currently facing business executives to store and process the growing
IBM Software Information Management White Paper Harnessing the power of advanced analytics with IBM Netezza How an appliance approach simplifies the use of advanced analytics Harnessing the power of advanced
VDI Without Compromise with SimpliVity OmniStack and Citrix XenDesktop Page 1 of 11 Introduction Virtual Desktop Infrastructure (VDI) provides customers with a more consistent end-user experience and excellent
HP Asset Manager Asset Manager 5.10 Sizing Guide Using the Oracle Database Server, or IBM DB2 Database Server, or Microsoft SQL Server Legal Notices... 2 Introduction... 3 Asset Manager Architecture...
Netezza Business Partner Update: November 17, 2011 Netezza and Business Analytics Synergy Shimon Nir, IBM Agenda Business Analytics / Netezza Synergy Overview Netezza overview Enabling the Business with
SUN ORACLE EXADATA STORAGE SERVER KEY FEATURES AND BENEFITS FEATURES 12 x 3.5 inch SAS or SATA disks 384 GB of Exadata Smart Flash Cache 2 Intel 2.53 Ghz quad-core processors 24 GB memory Dual InfiniBand
JVM Performance Study Comparing Oracle HotSpot and Azul Zing Using Apache Cassandra January 2014 Legal Notices Apache Cassandra, Spark and Solr and their respective logos are trademarks or registered trademarks
AssetWise Performance Management APM Installation Prerequisites Trademark Notice Bentley, the B Bentley logo, AssetWise, Ivara, the Ivara EXP logo, Ivara Work Smart, Aladon and RCM2 are either registered
IBM PureFlex and Atlantis ILIO: Cost-effective, high-performance and scalable persistent VDI Highlights Lower than PC cost: saves hundreds of dollars per desktop, as storage capacity and performance requirements
Running FileMaker Pro 5.0v3 on Windows 2000 Terminal Services 2000 FileMaker, Inc. All Rights Reserved. FileMaker, Inc. 5201 Patrick Henry Drive Santa Clara, California 95054 www.filemaker.com FileMaker
Dell Virtualization Solution for Microsoft SQL Server 2012 using PowerEdge R820 This white paper discusses the SQL server workload consolidation capabilities of Dell PowerEdge R820 using Virtualization.
SMB Direct for SQL Server and Private Cloud Increased Performance, Higher Scalability and Extreme Resiliency June, 2014 Mellanox Overview Ticker: MLNX Leading provider of high-throughput, low-latency server
Predictive analytics with System z Faster, broader, more cost effective access to critical insights Highlights Optimizes high-velocity decisions that can consistently generate real business results Integrates
An Oracle White Paper March 2013 Load Testing Best Practices for Oracle E- Business Suite using Oracle Application Testing Suite Executive Overview... 1 Introduction... 1 Oracle Load Testing Setup... 2
An Oracle White Paper November 2010 Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics 1 Introduction New applications such as web searches, recommendation engines,
Quantum StorNext Product Brief: Distributed LAN Client NOTICE This product brief may contain proprietary information protected by copyright. Information in this product brief is subject to change without
Inmagic Content Server v9.0 Standard Configuration Technical Guidelines 5/2006 Page 1 of 15 Inmagic Content Server v9 Standard Configuration Technical Guidelines Last Updated: May, 2006 Inmagic, Inc. All
Cisco for SAP HANA Scale-Out Solution Solution Brief December 2014 With Intelligent Intel Xeon Processors Highlights Scale SAP HANA on Demand Scale-out capabilities, combined with high-performance NetApp
Cray: Enabling Real-Time Discovery in Big Data Discovery is the process of gaining valuable insights into the world around us by recognizing previously unknown relationships between occurrences, objects
W H I T E P A P E R : T E C H N I C A L Understanding and Configuring Symantec Endpoint Protection Group Update Providers Martial Richard, Technical Field Enablement Manager Table of Contents Content Introduction...
Powered by Vertica Solution Series in conjunction with: hmetrix Revolutionizing Healthcare Analytics with Vertica & Tableau The cost of healthcare in the US continues to escalate. Consumers, employers,
2 WHITE PAPER: BEST PRACTICES Sizing and Scalability Recommendations for Symantec Rev 2.3 Symantec Enterprise Security Solutions Group White Paper: Symantec Best Practices Contents Introduction... 4 The
White Paper Dell Microsoft - Reference Configurations Deploying Microsoft SQL Server 2005 Business Intelligence and Data Warehousing Solutions on Dell PowerEdge Servers and Dell PowerVault Storage Abstract
1 WWW.FUSIONIO.COM WHITE PAPER WHITE PAPER Executive Summary Fusion iovdi is the first desktop- aware solution to virtual desktop infrastructure. Its software- defined approach uniquely combines the economics
WHITEPAPER Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software SanDisk ZetaScale software unlocks the full benefits of flash for In-Memory Compute and NoSQL applications
Azure Scalability Prescriptive Architecture using the Enzo Multitenant Framework Many corporations and Independent Software Vendors considering cloud computing adoption face a similar challenge: how should
Symantec Backup Exec 10d System Sizing Best Practices For Optimizing Performance of the Continuous Protection Server Table of Contents Table of Contents...2 Executive Summary...3 System Sizing and Performance
Holistic Performance Analysis of J2EE Applications By Madhu Tanikella In order to identify and resolve performance problems of enterprise Java Applications and reduce the time-to-market, performance analysis
Oracle Primavera P6 Enterprise Project Portfolio Management Performance and Sizing Guide An Oracle White Paper October 2010 Disclaimer The following is intended to outline our general product direction.
EMC Unified Storage for Microsoft SQL Server 2008 Enabled by EMC CLARiiON and EMC FAST Cache Reference Copyright 2010 EMC Corporation. All rights reserved. Published October, 2010 EMC believes the information
Sage Grant Management System Requirements You should meet or exceed the following system requirements: One Server - Database/Web Server The following system requirements are for Sage Grant Management to
Windows Embedded Security and Surveillance Solutions Windows Embedded 2010 Page 1 Copyright The information contained in this document represents the current view of Microsoft Corporation on the issues