The Impact of Memory Subsystem Resource Sharing on Datacenter Applications. Lingjia Tang, Jason Mars, Neil Vachharajani, Robert Hundt, Mary Lou Soffa
Transcription
1 The Impact of Memory Subsystem Resource Sharing on Datacenter Applications. Lingjia Tang, Jason Mars, Neil Vachharajani, Robert Hundt, Mary Lou Soffa
2 Introduction: Problem. Recent studies of memory resource sharing among threads have concentrated on common multithreaded benchmark workloads (SPEC, etc.). These studies do not represent the workloads that have emerged in datacenter applications. Real-world datacenter applications
3 Introduction. Demonstrates the impact of memory resource sharing in datacenter applications: the importance of a good thread-to-core (TTC) mapping. Evaluates the impact of co-locating threads: the optimal TTC mapping changes with the co-located application. Identifies characteristics that affect performance under various TTC mappings: the amount of sharing between threads, the required memory bandwidth, and the application's cache footprint.
4 Background and Motivation: Memory Resource Sharing. Sharing configurations can vary within the same set of cores.
5 Background and Motivation: Datacenter Job Scheduling. A global job scheduler selects an appropriate machine; the machine's OS then selects the TTC mapping. Memory resource sharing is not considered in scheduling decisions. Job Priority and Co-location: Latency-sensitive applications take a higher priority. High-priority jobs are scheduled with lower-priority jobs for efficiency. Managing QoS priorities on multicores remains a challenge.
6 Intra-application Sharing: Experiment Methodology. Platform: Intel Clovertown (Xeon E5345), dual-socket; Linux kernel with a custom GCC. Latency-sensitive applications run alone or with a batch application. A load generator tests peak behavior with real-world query traces. Experiments show performance variability among different TTC mappings. Three experimental runs of a fixed load executing across four cores. Measured each application's specified performance metric.
7 Intra-application Sharing: Measurement and Findings. The performance impact of resource sharing is significant, although each application has a different sharing preference. bigtable actually performs better when sharing resources.
8 Intra-application Sharing: Performance Variability, Last Level Cache Misses. Normalized to {X.X.X.X.}. Fixed code section. Results are consistent with the performance trend: more LLC misses lead to performance degradation.
9 Intra-application Sharing: Performance Variability, FSB Bandwidth Consumption. Normalized to {X.X.X.X.}. Fixed code section. Results are consistent with the performance trend: more LLC misses increase the number of bus requests.
10 Intra-application Sharing: Performance Variability, Data Sharing. Five states of L2 requests: Prefetch, Modified, Exclusive, Shared, and Invalid. Purple shows the amount of sharing between threads. Observations are consistent with the performance trend.
11 Intra-application Sharing: Performance Variability, Summary. LLC sharing has significant positive or negative performance effects and can add up to 10% variability. Bus contention also has significant effects and can add another 10% of variability. Applications with a high level of data sharing benefit significantly from sharing the LLC and FSB. The results demonstrate the importance of a good TTC mapping that mimics the application's inherent data sharing pattern.
12 Inter-application Sharing: Experiment Design. Same setup as the previous experiment, except that this experiment considers the impact of co-located applications under varying TTC mappings. The performance of contentanalyzer, bigtable, and websearch is measured while they share resources with co-running batch applications. The co-running applications are stitcher and protobuf; each is a batch, low-priority application.
13 Inter-application Sharing: Measurement and Findings. Normalized to solo performance under the corresponding TTC mapping. The TTC mappings have different effects when a co-runner (*) is present: {X*X*X*X*} shares the LLC, FSB, and memory subsystem; {XX**XX**} shares the FSB and memory subsystem; {XXXX****} shares only the memory subsystem (unavoidable).
14 Inter-application Sharing: Measurement and Findings. Normalized to solo performance under the TTC mapping {X.X.X.X.}. The TTC mappings have different effects when a co-runner (*) is present: {X*X*X*X*} shares the LLC, FSB, and memory subsystem; {XX**XX**} shares the FSB and memory subsystem; {XXXX****} shares only the memory subsystem (unavoidable).
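The mapping notation on slides 13 and 14 can be checked mechanically. The sketch below is only an illustration: it assumes an eight-core layout in which each adjacent pair of cores shares an LLC and each group of four cores shares an FSB (a topology chosen to agree with the three descriptions on the slides, not one the slides spell out), and the function names are hypothetical.

```python
from collections import defaultdict

# Assumed topology, chosen to match the slide descriptions:
# adjacent core pairs share an LLC, groups of four cores share an FSB.
CORES_PER_LLC = 2
CORES_PER_FSB = 4


def parse_mapping(mapping):
    """Return {symbol: [core indices]} for a mapping such as '{XX**XX**}'."""
    slots = mapping.strip("{}").replace(" ", "")
    placement = defaultdict(list)
    for core, symbol in enumerate(slots):
        if symbol != ".":  # '.' marks an idle core
            placement[symbol].append(core)
    return dict(placement)


def shared_resources(mapping, a="X", b="*"):
    """List the resources that applications a and b share under a mapping."""
    placement = parse_mapping(mapping)
    cores_a = placement.get(a, [])
    cores_b = placement.get(b, [])
    shared = []
    # Two applications share a resource when any of their cores fall
    # into the same LLC group or the same FSB group.
    if {c // CORES_PER_LLC for c in cores_a} & {c // CORES_PER_LLC for c in cores_b}:
        shared.append("LLC")
    if {c // CORES_PER_FSB for c in cores_a} & {c // CORES_PER_FSB for c in cores_b}:
        shared.append("FSB")
    if cores_a and cores_b:
        shared.append("memory")  # the memory subsystem is always shared
    return shared
```

Under these assumptions, `shared_resources("{XX**XX**}")` reports the FSB and memory subsystem only, which agrees with the slide text.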
15 Inter-application Sharing: Measurement and Findings, Summary. The impact of sharing the LLC and FSB is affected by the specific co-running application. Performance swings between the best and worst TTC mappings can be very significant. The optimal TTC mapping for an application may change with the specific co-running application chosen. An intelligent TTC mapping system will account for underlying resources and sharing configurations.
16 Varying Thread Count and Architecture: Varying the Number of Threads. Latency-sensitive applications run with 2 threads; co-located batch applications run with 6 threads. Normalized to applications running solo with TTC mapping {X X }. Results are consistent with 4-thread behavior.
17 Varying Thread Count and Architecture: Varying the Number of Threads. Latency-sensitive applications run with 6 threads; co-located batch applications run with 2 threads. Normalized to applications running solo with TTC mapping {X X }. Results are consistent with 4-thread behavior.
18 Varying Thread Count and Architecture: Varying Architecture. Experiments used Intel's Westmere (Xeon X5660) dual-socket platform. Each socket has six cores with a 12MB shared LLC and a shared memory controller with 3 channels of 8.5 GB/s bus to DIMM. Key application running solo with 2 threads. Normalized to the TTC mapping {X...X...}.
19 Varying Thread Count and Architecture: Varying Architecture. Experiments used Intel's Westmere (Xeon X5660) dual-socket platform. Each socket has six cores with a 12MB shared LLC and a shared memory controller with 3 channels of 8.5 GB/s bus to DIMM. Key application running solo with 6 threads. Normalized to the TTC mapping {XXX...XXX...}.
20 Varying Thread Count and Architecture: Varying Architecture. Experiments used Intel's Westmere (Xeon X5660) dual-socket platform. Each socket has six cores with a 12MB shared LLC and a shared memory controller with 3 channels of 8.5 GB/s bus to DIMM. Key application running with 6 threads; batch application running with 6 threads. Normalized to the TTC mapping {XXX...XXX...}.
21 Thread-To-Core Mapping: Memory Bandwidth Usage. Hypothesis: is an application's bus bandwidth usage an indicator of its proper FSB sharing configuration? Yes, FSB demand correlates with the optimal TTC mapping found. Applications with high FSB demand significantly impact performance.
22 Thread-To-Core Mapping: Data Sharing. The percentage of an application's cache lines in the shared state can indicate the application's level of data sharing. Measured LLC misses, shared LLC references, and other LLC hits.
23 Thread-To-Core Mapping: Cache Footprint. Contention occurs when the total size of multiple threads' cache footprints is larger than the cache itself. LLC miss rate is a good indicator of footprint size and predicts potential performance degradation for co-runners.
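The contention condition on slide 23 is a simple inequality: the sum of the threads' footprints exceeds the cache capacity. A minimal sketch follows; the function name and the footprint figures in the usage note are hypothetical, and only the 12MB LLC size comes from the Westmere slides.

```python
def llc_contention_expected(footprints_mb, llc_size_mb):
    """Return True when the combined cache footprints exceed the LLC size.

    footprints_mb: per-thread (or per-application) footprints in MB,
                   illustrative inputs rather than measured values.
    llc_size_mb:   capacity of the shared last-level cache in MB.
    """
    return sum(footprints_mb) > llc_size_mb
```

For a 12MB shared LLC, two hypothetical footprints of 5MB each fit (`llc_contention_expected([5, 5], 12)` is `False`), while footprints of 8MB and 6MB would be expected to contend.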
24 Thread-To-Core Mapping: A Heuristic Approach to TTC Mapping. An algorithm predicts optimal TTC mappings based on the resource usage characteristics of an application (bus usage, cache usage, and data sharing). It avoids co-locating threads with the same resource bottlenecks while maximizing the potential benefits of sharing resources.
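The slide does not give the algorithm itself, so the following is not the authors' heuristic. It is a threshold-based rule of thumb written only to illustrate the stated idea of avoiding a common bottleneck while keeping useful sharing. The `Profile` fields, the three thresholds, and the choice among the three co-location mappings are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    """Hypothetical per-application profile; field names are illustrative."""
    bus_demand: float       # share of FSB bandwidth used, 0.0 to 1.0
    llc_miss_rate: float    # used here as a proxy for cache footprint
    shared_fraction: float  # fraction of cache lines in the shared state


# Hypothetical thresholds; real values would need per-platform calibration.
BUS_HEAVY = 0.5
CACHE_HEAVY = 0.1
SHARING_HEAVY = 0.2


def suggest_mapping(key: Profile, batch: Profile) -> str:
    """Pick one of the three co-location mappings named on the slides."""
    both_bus_heavy = (key.bus_demand >= BUS_HEAVY
                      and batch.bus_demand >= BUS_HEAVY)
    both_cache_heavy = (key.llc_miss_rate >= CACHE_HEAVY
                        and batch.llc_miss_rate >= CACHE_HEAVY)
    if both_bus_heavy:
        # Same bus bottleneck: share only the memory subsystem.
        return "{XXXX****}"
    if both_cache_heavy or key.shared_fraction >= SHARING_HEAVY:
        # Avoid a shared LLC, or keep the key application's threads
        # together so they can benefit from their own data sharing.
        return "{XX**XX**}"
    # No common bottleneck detected: interleave across all caches.
    return "{X*X*X*X*}"
```

The ordering of the checks is a design choice of this sketch: bus contention is tested first because it calls for the most isolated mapping.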
25 Thread-To-Core Mapping: A Heuristic Approach to TTC Mapping, Evaluating the Heuristics. Limitations: the algorithm is hardware-specific; characteristic profiles are necessary for each application; characteristic metrics may not accurately represent the application. Advantages: effective and requires only simple runtime support. Evaluation:
26 Thread-To-Core Mapping: An Adaptive Approach to TTC Mapping. AToM (Adaptive Thread-to-core Mapping) is an adaptive learning implementation that uses a competition heuristic. It searches for the optimal TTC assignment for a given set of threads. Learning phase: various mappings compete for the best performance. Execution phase: the winner of the learning phase is executed for some time period.
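The two phases described on slide 26 can be sketched as a loop that scores each candidate mapping and then runs the winner. This is a simplified illustration, not the AToM implementation: `measure` and `execute` are hypothetical callables supplied by the caller, and the `samples` and `rounds` parameters are assumptions.

```python
def learning_phase(mappings, measure, samples=1):
    """Let candidate mappings compete; return the one with the best mean score.

    measure(mapping) is a caller-supplied function that runs the workload
    under the mapping briefly and returns a score (higher is better).
    """
    best_mapping, best_score = None, float("-inf")
    for mapping in mappings:
        score = sum(measure(mapping) for _ in range(samples)) / samples
        if score > best_score:
            best_mapping, best_score = mapping, score
    return best_mapping


def adaptive_loop(mappings, measure, execute, rounds=1):
    """Alternate learning and execution phases; return the winners chosen."""
    winners = []
    for _ in range(rounds):
        winner = learning_phase(mappings, measure)
        execute(winner)  # run under the winning mapping for the execution period
        winners.append(winner)
    return winners
```

Repeating the learning phase every round lets the choice follow changes in the workload or its co-runner, at the cost of time spent running mappings that lose the competition.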
27 Thread-To-Core Mapping: An Adaptive Approach to TTC Mapping, Evaluating AToM. Applications are configured to run on 4 cores. Learning phase: 3 runs of 10 minutes each. Execution phase: 1 run of 2 hours. Performance is normalized to performance under the TTC mapping {X...X...}.
28 Conclusion. Outlined the importance of intelligent TTC mapping decisions: the optimal TTC mapping varies with the co-running application. Presented key characteristics that impact TTC mappings. Presented heuristics for TTC mapping using those characteristics. Presented a more attractive adaptive approach that is simple to implement and very effective for long-running applications. Discovered a performance swing of up to 40%; a 1% improvement could save a datacenter millions of dollars. Using the adaptive approach, datacenter workloads could improve by 22% and perform within 3% of the optimal mapping on average.
29 Discussion: Any questions?
30 Related Work. Similar datacenter application studies: all study the effects of application characteristics on datacenter system design, in combination with the underlying architecture, or on database workloads. Architecture: studies investigate memory resource sharing and contention; new architectures support cache and memory bus management. Software (operating systems): techniques focus on co-scheduling to avoid resource contention, on controlling execution rate using hardware features or a runtime system, and on cache- and contention-aware schedulers and compilers.
More informationOptimizing Web Infrastructure on Intel Architecture
White Paper Intel Processors for Web Architectures Optimizing Web Infrastructure on Intel Architecture Executive Summary and Purpose of this Paper Today s data center infrastructures must adapt to mobile
More informationOptimization of Cluster Web Server Scheduling from Site Access Statistics
Optimization of Cluster Web Server Scheduling from Site Access Statistics Nartpong Ampornaramveth, Surasak Sanguanpong Faculty of Computer Engineering, Kasetsart University, Bangkhen Bangkok, Thailand
More informationHardware performance monitoring. Zoltán Majó
Hardware performance monitoring Zoltán Majó 1 Question Did you take any of these lectures: Computer Architecture and System Programming How to Write Fast Numerical Code Design of Parallel and High Performance
More informationPower Efficiency Comparison: Cisco UCS 5108 Blade Server Chassis and Dell PowerEdge M1000e Blade Enclosure
White Paper Power Efficiency Comparison: Cisco UCS 5108 Blade Server Chassis and Dell PowerEdge M1000e Blade Enclosure White Paper March 2014 2014 Cisco and/or its affiliates. All rights reserved. This
More informationHUAWEI TECHNOLOGIES CO., LTD. HUAWEI FusionServer X6800 Data Center Server
HUAWEI TECHNOLOGIES CO., LTD. HUAWEI FusionServer X6800 Data Center Server HUAWEI FusionServer X6800 Data Center Server Data Center Cloud Internet App Big Data HPC As the IT infrastructure changes with
More informationDelivering 160Gbps DPI Performance on the Intel Xeon Processor E5-2600 Series using HyperScan
SOLUTION WHITE PAPER Intel processors Pattern Matching Library Software Delivering 160Gbps DPI Performance on the Intel Xeon Processor E5-2600 Series using HyperScan HyperScan s runtime is engineered for
More informationDetailed Characterization of HPC Applications for Co-Scheduling
Detailed Characterization of HPC Applications for Co-Scheduling ABSTRACT Josef Weidendorfer Department of Informatics, Chair for Computer Architecture Technische Universität München weidendo@in.tum.de
More informationEnabling Technologies for Distributed and Cloud Computing
Enabling Technologies for Distributed and Cloud Computing Dr. Sanjay P. Ahuja, Ph.D. 2010-14 FIS Distinguished Professor of Computer Science School of Computing, UNF Multi-core CPUs and Multithreading
More informationMemory Performance at Reduced CPU Clock Speeds: An Analysis of Current x86 64 Processors
Memory Performance at Reduced CPU Clock Speeds: An Analysis of Current x86 64 Processors Robert Schöne, Daniel Hackenberg, and Daniel Molka Center for Information Services and High Performance Computing
More informationRun-time Resource Management in SOA Virtualized Environments. Danilo Ardagna, Raffaela Mirandola, Marco Trubian, Li Zhang
Run-time Resource Management in SOA Virtualized Environments Danilo Ardagna, Raffaela Mirandola, Marco Trubian, Li Zhang Amsterdam, August 25 2009 SOI Run-time Management 2 SOI=SOA + virtualization Goal:
More informationOptimizing Virtual Machine Scheduling in NUMA Multicore Systems
Optimizing Virtual Machine Scheduling in NUMA Multicore Systems Jia Rao, Kun Wang, Xiaobo Zhou, Cheng-Zhong Xu Dept. of Computer Science Dept. of Electrical and Computer Engineering University of Colorado,
More informationSQL Server Consolidation Using Cisco Unified Computing System and Microsoft Hyper-V
SQL Server Consolidation Using Cisco Unified Computing System and Microsoft Hyper-V White Paper July 2011 Contents Executive Summary... 3 Introduction... 3 Audience and Scope... 4 Today s Challenges...
More informationMAQAO Performance Analysis and Optimization Tool
MAQAO Performance Analysis and Optimization Tool Andres S. CHARIF-RUBIAL andres.charif@uvsq.fr Performance Evaluation Team, University of Versailles S-Q-Y http://www.maqao.org VI-HPS 18 th Grenoble 18/22
More informationManaged Virtualized Platforms: From Multicore Nodes to Distributed Cloud Infrastructures
Managed Virtualized Platforms: From Multicore Nodes to Distributed Cloud Infrastructures Ada Gavrilovska Karsten Schwan, Mukil Kesavan Sanjay Kumar, Ripal Nathuji, Adit Ranadive Center for Experimental
More informationPerformance Characteristics of VMFS and RDM VMware ESX Server 3.0.1
Performance Study Performance Characteristics of and RDM VMware ESX Server 3.0.1 VMware ESX Server offers three choices for managing disk access in a virtual machine VMware Virtual Machine File System
More informationVirtualCenter Database Performance for Microsoft SQL Server 2005 VirtualCenter 2.5
Performance Study VirtualCenter Database Performance for Microsoft SQL Server 2005 VirtualCenter 2.5 VMware VirtualCenter uses a database to store metadata on the state of a VMware Infrastructure environment.
More information