SAS Business Analytics. Base SAS for SAS 9.2



Similar documents
Vertical Scaling of Oracle 10g Performance on Red Hat Enterprise Linux 5 on Intel Xeon Based Servers. Version 1.0

Removing Performance Bottlenecks in Databases with Red Hat Enterprise Linux and Violin Memory Flash Storage Arrays. Red Hat Performance Engineering

Oracle Database Scalability in VMware ESX VMware ESX 3.5

Integrated Grid Solutions. and Greenplum

HP reference configuration for entry-level SAS Grid Manager solutions

Technical Paper. Moving SAS Applications from a Physical to a Virtual VMware Environment

Cloud Storage. Parallels. Performance Benchmark Results. White Paper.

FOR SERVERS 2.2: FEATURE matrix

RED HAT ENTERPRISE VIRTUALIZATION FOR SERVERS: COMPETITIVE FEATURES

3 Red Hat Enterprise Linux 6 Consolidation

Business-centric Storage for small and medium-sized enterprises. How ETERNUS DX powered by Intel Xeon processors improves data management

Scaling Microsoft Exchange in a Red Hat Enterprise Virtualization Environment

Business-centric Storage for small and medium-sized enterprises. How ETERNUS DX powered by Intel Xeon processors improves data management

SQL Server Virtualization

How to Choose your Red Hat Enterprise Linux Filesystem

Oracle Database Deployments with EMC CLARiiON AX4 Storage Systems

HP SN1000E 16 Gb Fibre Channel HBA Evaluation

EMC Unified Storage for Microsoft SQL Server 2008

DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION

Virtuoso and Database Scalability

Cisco for SAP HANA Scale-Out Solution on Cisco UCS with NetApp Storage

Understanding the Benefits of IBM SPSS Statistics Server

Technical Paper. Performance and Tuning Considerations for SAS on the EMC XtremIO TM All-Flash Array

Cisco Prime Home 5.0 Minimum System Requirements (Standalone and High Availability)

HP ProLiant BL660c Gen9 and Microsoft SQL Server 2014 technical brief

Microsoft Private Cloud Fast Track Reference Architecture

The Advantages of Multi-Port Network Adapters in an SWsoft Virtual Environment

How To Store Data On An Ocora Nosql Database On A Flash Memory Device On A Microsoft Flash Memory 2 (Iomemory)

A Survey of Shared File Systems

Technical Paper. Performance and Tuning Considerations for SAS on Fusion-io ioscale Flash Storage

Achieving Nanosecond Latency Between Applications with IPC Shared Memory Messaging

Dell Virtualization Solution for Microsoft SQL Server 2012 using PowerEdge R820

Performance Characteristics of VMFS and RDM VMware ESX Server 3.0.1

Performance and scalability of a large OLTP workload

Performance characterization report for Microsoft Hyper-V R2 on HP StorageWorks P4500 SAN storage

Performance and Scalability of the Red Hat Enterprise Healthcare Platform

Performance Comparison of Fujitsu PRIMERGY and PRIMEPOWER Servers

Maximum performance, minimal risk for data warehousing

Red Hat Enterprise Linux 6. Stanislav Polášek ELOS Technologies

Evaluation Report: Accelerating SQL Server Database Performance with the Lenovo Storage S3200 SAN Array

Technical Paper. Performance and Tuning Considerations for SAS on Pure Storage FA-420 Flash Array

Using VMware VMotion with Oracle Database and EMC CLARiiON Storage Systems

SAN Conceptual and Design Basics

EMC Backup and Recovery for Microsoft SQL Server

Performance Evaluation of VMXNET3 Virtual Network Device VMware vsphere 4 build

AirWave 7.7. Server Sizing Guide

PARALLELS CLOUD STORAGE

Virtualization Performance on SGI UV 2000 using Red Hat Enterprise Linux 6.3 KVM

White Paper. Enhancing Storage Performance and Investment Protection Through RAID Controller Spanning

SQL Server Consolidation Using Cisco Unified Computing System and Microsoft Hyper-V

Analyzing the Virtualization Deployment Advantages of Two- and Four-Socket Server Platforms

Benchmarking Cassandra on Violin

Benchmarking Hadoop & HBase on Violin

SUN ORACLE EXADATA STORAGE SERVER

An Oracle White Paper August Oracle VM 3: Server Pool Deployment Planning Considerations for Scalability and Availability

How To Make A Backup System More Efficient

TekSouth Fights US Air Force Data Center Sprawl with iomemory

Introduction 1 Performance on Hosted Server 1. Benchmarks 2. System Requirements 7 Load Balancing 7

8Gb Fibre Channel Adapter of Choice in Microsoft Hyper-V Environments

InterSystems and VMware Increase Database Scalability for Epic EMR Workload by 60 Percent with Intel Xeon Processor E7 v3 Family

Boost Database Performance with the Cisco UCS Storage Accelerator

Virtualizing SQL Server 2008 Using EMC VNX Series and Microsoft Windows Server 2008 R2 Hyper-V. Reference Architecture

Infortrend ESVA Family Enterprise Scalable Virtualized Architecture

Leveraging EMC Fully Automated Storage Tiering (FAST) and FAST Cache for SQL Server Enterprise Deployments

Hitachi Virtage Embedded Virtualization Hitachi BladeSymphony 10U

Evaluation Report: HP Blade Server and HP MSA 16GFC Storage Evaluation

How To Connect Virtual Fibre Channel To A Virtual Box On A Hyperv Virtual Machine

Express5800 Scalable Enterprise Server Reference Architecture. For NEC PCIe SSD Appliance for Microsoft SQL Server

How System Settings Impact PCIe SSD Performance

PSAM, NEC PCIe SSD Appliance for Microsoft SQL Server (Reference Architecture) September 11 th, 2014 NEC Corporation

7 Real Benefits of a Virtual Infrastructure

The MAX5 Advantage: Clients Benefit running Microsoft SQL Server Data Warehouse (Workloads) on IBM BladeCenter HX5 with IBM MAX5.

Windows Server 2008 R2 Hyper-V Live Migration

Cisco, Citrix, Microsoft, and NetApp Deliver Simplified High-Performance Infrastructure for Virtual Desktops

Optimizing Web Infrastructure on Intel Architecture

Best Practices for Implementing iscsi Storage in a Virtual Server Environment

HP recommended configuration for Microsoft Exchange Server 2010: HP LeftHand P4000 SAN

EMC Backup and Recovery for Microsoft SQL Server

Red Hat enterprise virtualization 3.0 feature comparison

EMC XTREMIO EXECUTIVE OVERVIEW

Dell Microsoft Business Intelligence and Data Warehousing Reference Configuration Performance Results Phase III

HP StorageWorks P2000 G3 and MSA2000 G2 Arrays

Distribution One Server Requirements

Amadeus SAS Specialists Prove Fusion iomemory a Superior Analysis Accelerator

The Evolution of Microsoft SQL Server: The right time for Violin flash Memory Arrays

White paper 200 Camera Surveillance Video Vault Solution Powered by Fujitsu

White Paper. Recording Server Virtualization

ARCHITECTING COST-EFFECTIVE, SCALABLE ORACLE DATA WAREHOUSES

Unitrends Recovery-Series: Addressing Enterprise-Class Data Protection

Software-defined Storage at the Speed of Flash

System and Storage Virtualization For ios (AS/400) Environment

Optimizing Large Arrays with StoneFly Storage Concentrators

Maximizing SQL Server Virtualization Performance

Achieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks

Virtual SAN Design and Deployment Guide

Increase Database Performance by Implementing Cirrus Data Solutions DCS SAN Caching Appliance With the Seagate Nytro Flash Accelerator Card

Transcription:

Performance & Scalability of SAS Business Analytics on an NEC Express5800/A1080a (Intel Xeon 7500 series-based Platform) using Red Hat Enterprise Linux 5 SAS Business Analytics Base SAS for SAS 9.2 Red Hat Enterprise Linux 5.4 NEC Express5800/A1080a 8 Socket Intel Xeon 7500 series-based platform (8 Cores / Socket = 64 Cores 2 Threads / Core = 128 Threads) Version 1.0 April 2010

Performance & Scalability of SAS Business Analytics on an NEC Express5800/A1080a (Intel Xeon 7500 series-based Platform) using Red Hat Enterprise Linux 5 1801 Varsity Drive Raleigh NC 27606-2072 USA Phone: +1 919 754 3700 Phone: 888 733 4281 Fax: +1 919 754 3701 PO Box 13588 Research Triangle Park NC 27709 USA "Red Hat," Red Hat Enterprise Linux, the Red Hat "Shadowman" logo, and the products listed are trademarks or registered trademarks of Red Hat, Inc. in the United States and other countries. Linux is a registered trademark of Linus Torvalds. All other trademarks referenced herein are the property of their respective owners. 2010 by Red Hat, Inc. This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, V1.0 or later (the latest version is presently available at http://www.opencontent.org/openpub/). The information contained herein is subject to change without notice. Red Hat, Inc. shall not be liable for technical or editorial errors or omissions contained herein. Distribution of modified versions of this document is prohibited without the explicit permission of Red Hat Inc. Distribution of the work or derivative of the work in any standard (paper) book form for commercial purposes is prohibited unless prior permission is obtained from Red Hat Inc. The GPG fingerprint of the security@redhat.com key is: CA 20 86 86 2B D6 9D FC 65 F6 EC C4 21 91 80 CD DB 42 A6 0E www.redhat.com 2

Table of Contents 1 Executive Summary... 5 2 Test Configuration... 6 2.1 Hardware... 7 NEC Express5800/A1080a Server... 7 NEC D3 SAN Storage Array... 8 Intel Xeon Processor 7500 Series... 8 2.2 SAS 9.2... 11 2.3 Red Hat Enterprise Linux 5.4... 11 3 Test Methodology... 12 3.1 Test Execution... 12 3.2 Data... 13 4 Performance Results... 14 4.1 Performance Effects of NUMA... 18 4.2 RHEL Observations & Tuning... 20 4.3 SAS Tuning Guidelines... 21 5 Conclusions... 22 3 www.redhat.com

www.redhat.com 4

1 Executive Summary Today, companies are increasingly utilizing analytics to discover new revenue and costsaving opportunities. Many business professionals turn to SAS, a leader in business analytics software and service, to help them improve performance and make better decisions faster. Analytics are also being employed in risk management, fraud detection, life sciences, sports, and many more emerging markets. However, to maximize the value to the business, analytics solutions need to be deployed quickly and cost-effectively, while also providing the ability to readily scale without degrading performance. Of course, in today s demanding environments, where budgets are still shrinking and mandates to reduce carbon footprints are growing, the solution must deliver excellent hardware utilization, power efficiency, and ROI. To help solve these challenges, Red Hat, SAS, NEC, and Intel collaborated to prove the linear scalability of SAS 9.2 running Red Hat Enterprise Linux on NEC s newest Intel Xeon processor 7500 series-based platform. The result is a pre-tested, scalable server configuration that can help you deploy faster, reduce risk, lower cost, and plan for future upgrades. Game changing performance and scalability The ability to take advantage of performance enhancements in the latest processors and the ability to tune I/O and virtual memory, makes Red Hat Enterprise Linux an ideal platform for SAS Business Analytics. Industry benchmarks reflect the scalability and performance of Red Hat Enterprise Linux in both scale-up (vertical) and scale-out (grid) models. The SAS 9.2 test results documented here demonstrate excellent scalability up to 64 cores (128 threads) on a single system. 5 www.redhat.com

2 Test Configuration System Configuration: NEC Express5800/A1080a system 8 Intel Xeon processor 7500 series (8 cores per socket) 64 total cores/128 threads Intel Hyper-threading Technology on Intel Turbo Boost Technology on 256 GB RAM 4 x external NEC D-series storage arrays 30 x 1 TB SATA 7200 RPM disk per array 16 x 4 G-bit fiber connections to disk Note: Half the CPU cores, RAM and storage was used for the 32-core test. SAS Software: Foundation SAS 9.2, SAS/STAT Operating System: Red Hat Enterprise Linux 5.4z 64-bit www.redhat.com 6

2.1 Hardware NEC Express5800/A1080a Server Figure 1 Designed specifically for the Intel Xeon processor 7500 series and scalable from 8 to 64 processor cores and up to 128 threads, the NEC Express5800/A1080a server is an ideal 7 www.redhat.com

platform for scaling up or consolidating SAS 9.2 instances. This server takes advantage of the intelligent performance, energy efficiency, and virtualization capabilities of the Intel Xeon processor and leverages the low latency and high-performance interface technology of NEC s supercomputers. The server supports modular designs, redundant components, hot plug capabilities, and floating I/O. Additional benefits include: Up to 64 cores, 128 threads, and 2 TB memory Integrated Intel Quick-Path Interconnect technology to increase performance through efficient memory access Green Cooling Technology that helps to minimize power consumption and automate power usage for more effective datacenter consolidation benefits Built-in service processor working in conjunction with NEC s BIOS and Intel s Machine Check Architecture to provide reliability, availability, and serviceability for mission-critical computing Server virtualization that offers high performance, energy efficiency, and higher server bandwidth to handle the increased communications in a virtualized environment NEC D3 SAN Storage Array The NEC D3 SAN Storage Array offers scalability to 288 TB and aggregate throughput of over 1,100 MB/s. The system is fully redundant to protect against single point of failure, with a battery backup unit to protect its 4 GB of cache. Features include replication, snapshots, performance monitoring, automatic tuning, multipathing, and failover. Intel Xeon Processor 7500 Series Intel Xeon Processor 7500 Series is built to handle your most processor-intensive, mission-critical applications, the Intel Xeon processor 7500 series delivers a quantum leap in enterprise computing performance. The Intel Xeon processor 7500 series combines up to eight cores and 16 processing threads in a single device and offers four advanced, high-bandwidth interconnect links that allow multiple processors to be directly connected to each other. The result is unprecedented scalability. www.redhat.com 8

Figure 2 The Intel Xeon processor 7500 series intelligently adjusts performance and energy consumption to accommodate application needs. Built-in Intel Turbo Boost Technology automatically speeds up the processor when your SAS workload requires extra performance. Intel Hyper-Threading Technology allows each processor core to work on two tasks at the same time to enhance performance for highly-threaded workloads. Intel Intelligent Power Technology automatically places CPUs and memory into an optimal power state for maximum performance, while reducing energy use. 9 www.redhat.com

Figure 3 www.redhat.com 10

2.2 SAS 9.2 SAS 9.2 provides the core components of the SAS Business Analytics Framework and significant performance improvements over SAS 9.1 on Linux. SAS 9.2 helps users gain insights that are often hidden in data, so they can reach evidence-based decisions with confidence. SAS 9.2 supports the entire analysis process from data access to the point of decision however varied or complex. A wide range of data integration techniques empowers users to collect, classify, process, analyze, and interpret data to reveal new insights. SAS 9.2 advances the capabilities of SAS analytical products, including forecasting, data mining, optimization, and model management. SAS Analytics provide rapid answers to key business questions, allowing decision makers to react more quickly to fast changing conditions. SAS 9.2 is available as 64-bit enabled applications supporting 64-bit extended architectures. This enables you to scale up or consolidate multiple SAS instances within one affordable, powerful, commodity system. 2.3 Red Hat Enterprise Linux 5.4 Red Hat Enterprise Linux performance features include: SMP performance and scalability. Multi-process or threaded applications can be optimally scheduled in large SMP systems. A vast virtual address space enables SAS Business Analytics to effectively use more memory to work on larger data sets. Enhancements enable applications to effectively use more processors. Intelligent performance. Efficient use of software threads, support for hyperthreading technology, and the ability to change the clock speed on a running processor increase performance. Automated energy efficiency. Power technologies from Intel and AMD and optimizations in the operating system lower power consumption during off-peak times. Consuming less power means lower cooling requirements, which contributes to further savings and greener datacenters. Tuning for optimum I/O throughput. I/O performance can be optimized on a perdevice basis. Support for 10 gigabit Ethernet, iscsi, and Fibre Channel over Ethernet allow the latest storage technologies to be used. MPIO allows multiple connections from servers to storage to increase availability and throughput. 11 www.redhat.com

3 Test Methodology SAS created multiuser benchmarking scenarios to simulate the workload for a typical Foundation SAS customer. The goal of these scenarios was to evaluate the multiuser performance of SAS on various platforms. Various-sized mixed analytic workloads were created to simulate many users utilizing CPU, RAM and I/O resources that SAS programs heavily use during typical program execution. A 32-core and 64-core test scenario was executed with the help of Intel, Red Hat and NEC staff on a 64-core NEC Express5800/A1080a system running Red Hat Enterprise Linux 5.4z. 3.1 Test Execution The two scenarios used in the performance efforts included the following: 1) Mixed 32-core workload mix of CPU- and I/O-intensive jobs: a. 206 jobs launched during the scenario. b. PROCS: GLM, LOGISIC, RISK, REG, MEANS, SORT, FREQ, SUMMARY, SQL. c. Mixed shorter and longer running jobs. d. Goal of scenario is to leverage a mix of I/O and CPU resources. e. This version was designed to run on a 32-core system. 2) Mixed 64-core workload mix of CPU- and I/O-intensive jobs: a. 412 jobs launched during the scenario. b. This version was designed to run on a 64-core system. c. This version has twice the jobs as the 32-core test. www.redhat.com 12

Each test scenario consists of a set of SAS jobs run in a multiuser fashion to simulate a typical SAS batch, SAS Enterprise Guide user environment. All SAS jobs are a combination of computational- and I/O-intensive SAS procedures. Each scenario launches jobs simultaneously at a set interval to help simulate a multiuser environment where users come and go from the system. The test is designed to run in a period of 30 to 60 minutes. 3.2 Data Characteristics of the data for this scenario: SAS data sets and text files. Row counts up to 90 million. Variable counts up to 297. 1.1 GB total input/output data. File sizes ranging from several kilobytes to 30 GB in size. Data volumes were designed to be larger than the hardware cache in order to place realistic stress on the hardware and operating system file cache. Note: The SAS benchmarking scenarios are designed to replicate a typical Foundation SAS customer s resource use. However, particular customer applications can vary greatly depending on tasks, PROCs used, data volumes and other customer requirements. 13 www.redhat.com

4 Performance Results CPU and I/O utilization patterns for the 32- and 64-core tests were similar. This was expected, as the 64-core workload and CPU resources were doubled from the 32-core workload. CPU utilization ranged from 5 to 95 percent due to the nature of the workload and the way users come and go from the system. Analysis of the data focused on the workload response and run times of both the entire scenario and the sum of all job run times used in each scenario. Below is a table showing the various statistics and how much scalability was achieved going from the 32-core to the 64-core workload and hardware configuration. Cores Sum of all jobs run time in seconds Scenario time (all clock in seconds) Average job run time I/O rates MBps Peak Sustained 64 62,769 2,100 145.30 1,818 1,420 32 31,720 2,040 146.85 1,158 885 The following features contributed to the performance and scalability of the configuration: Intel Advanced Programmable Interrupt Controller (APIC) for optimized Non- Uniform Memory Access (NUMA) Support for Red Hat Enterprise Linux on the new multi-socket NEC Express5800/A1080a server NEC NUMA aware BIOS significant performance gains Red Hat Enterprise Linux processor scheduler automatically optimizes SAS application processes Intel Turbo Boost Technology and Intel Hyper-Threading Technology improved scalability. www.redhat.com 14

Figure 4: CPU Utilization (64-Core, 256 GB, XFS) Figure 5: CPU Utilization (32-Core, 128 GB, XFS) 15 www.redhat.com

Figure 6: I/O Throughput (64-Core, 256 GB, XFS) www.redhat.com 16

Figure 7: I/O Throughput (32-Core, 128 GB, XFS) Figure 8: Comparison of 64-Core & 32-Core I/O 17 www.redhat.com

4.1 Performance Effects of NUMA In general, servers with a Non Uniform Memory Architecture (NUMA) configuration, whose BIOS and operating system recognize the hardware layout perform best. Memory allocation policies with NUMA tend to try an allocate on the node where the process is scheduled. When the environment (BIOS/OS) does not properly support the NUMA configuration or has it disabled, memory is typically interleaved across all NUMA nodes. While performance may be more predictable, most of the memory accesses are on a remote node which increases latency. Memory accesses to the remote nodes occur: 3/4 of the time for the 4 socket 32 core configuration 7/8 of the time for the 8 socket 64 core configuration Since not all nodes are guaranteed to be directly connected to each other, the latency cost depends on how many nodes (hops) the memory request must travel through. The 4 socket 32 core configuration provides single hop connectivity to each node. In the 8 socket configuration, half of the nodes are a single hop away while the other half require two hops. For 4-node/32-core configuration there is only a 1-hop latency. Thus, there was a performance improvement of 11% (hw cost of 1-hop=X) if the scheduler can use local memory for the 216 jobs. For the 8-node/64-core configuration there is a variable number of hops (1-hop at X, and 2-hops Y). Thus the performance improvement rose to over 50% by allowing the scheduler to localize a good part of the memory references running up to 432 jobs at the same wall clock time that it took to run 1/2 the jobs on the 4- node/32-core configuration. Without NUMA in the OS, there would be a significant degradation in scaling. The importance of NUMA support in the 8 socket 64 core configuration is shown below. Without NUMA enabled, the higher ratio of remote accesses coupled with the multiple hops reduce the performance throughput significantly. www.redhat.com 18

Figure 9: Performance Effects of NUMA 19 www.redhat.com

4.2 RHEL Observations & Tuning RHEL 5.4z used for testing and recommended for any Intel Xeon processor 7500 series-based system. RHEL tuning best practices applied using Ktune* SELinux and unneeded services were disabled via chkconfig and sysctcl parameters NUMA (non-uniform memory access) is RHEL 5 default on NEC and all systems based on Intel processors 6000 and 7000 series. Check BIOS to ensure NUMA is enabled or disable memory interleave The I/O output from 8-Socket server exceeded the I/O capacity of test storage configuration. Server utilization would have been significantly greater with more storage controllers. o 1.78 GB/sec peak and 1.39 GB/sec sustained throughput for 64-Core runs o 1.13 GB/sec peak and 885 MB/sec sustained throughput for 32-Core runs Red Hat Enterprise Linux Tunable I/0 stack is essential to SAS performance o Tuned read-ahead on Logical Volume Manager (LVM) devices adjusted to 8192 bytes o Standard blockdev tool to adjust for large sequential access to LVM/filesystem XFS is best suited for large sequential I/O in the SAS Business Analytics workload o 30% improvement in performance over ext3 o Significant reduction is system CPU resources thus providing more compute cycles for more SAS jobs o Support for larger file extents, essential for very large files and provides significantly less file fragmentation and faster file deletion. SAS creates and deletes lots of files. www.redhat.com 20

4.3 SAS Tuning Guidelines Follow best practices for configuring I/O: Split permanent SAS data files, SAS WORK files, and SAS UTILLOC files into separate file systems Insure you have enough IO bandwidth to support the SAS application requirements Set the IO transfer unit size on storage array to match SAS BUFSIZE value Increase SAS BUFSIZE value if you are doing large volumes of IO 21 www.redhat.com

5 Conclusions The SAS 9.2 mixed analytic workload running on Red Hat Enterprise Linux results highlight: Test results for both computational and mixed analytic benchmarking scenarios. Linear scalability when workload and CPU resources doubled. 32- and 64-core scenarios executed. 206 and 412 job mixed computation and I/O-intensive scenario run in 35 minutes or less. Sustained I/O rates of over 1.4 gigabytes per second during 64-core workload. Peak I/O rates over 1.8 gigabytes per second during 64-core workload. Sustained I/O between 17 and 23 MBps per core during test scenario execution. www.redhat.com 22