Challenges for synchronization and scalability on manycore: a Software Transactional Memory approach


Maurício Lima Pilla, André Rauber Du Bois, Adenauer Correa Yamin, Ana Marilza Pernas Fleischmann, Gerson Geraldo Homrich Cavalheiro, Renata Hax Sander Reiser
Laboratory of Ubiquitous and Parallel Systems
Federal University of Pelotas
pilla@inf.ufpel.edu.br
March 2015
Manycore of 40

INTRODUCTION

Introduction
  Rise of multicore/manycore processors
    Hard to code
    Usage of locks becomes complex
      Deadlocks
      Software composition
  Transactional Memories (TM)
    Access to shared memory inside transactions
    Similar to database transactions

TRANSACTIONAL MEMORY BASICS

Properties
  Atomicity
  Consistency
  Isolation
  Durability

Transactional Memory Constructs
  Atomic
    Declares that operations must be executed atomically
  Retry
    If the transaction cannot be executed, try again later when a variable changes

    atomic {
      if (last == 0) retry;
      last--;
      return buffer[last];
    }
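
The blocking behaviour of retry can be loosely approximated without an STM runtime. The sketch below is only an analogue built on Python's threading.Condition, not a transactional implementation: the consumer sleeps while the buffer is empty and re-checks when a producer changes it. The names Buffer, put, take, and demo are hypothetical and do not come from the slides.

```python
import threading


class Buffer:
    """Stack-like buffer whose take() waits, like `retry`, until data exists."""

    def __init__(self):
        self.items = []
        self.changed = threading.Condition()  # signalled whenever items changes

    def put(self, value):
        with self.changed:
            self.items.append(value)
            self.changed.notify_all()  # wake blocked takers so they re-check

    def take(self):
        with self.changed:
            # Analogue of `if (last == 0) retry;`: sleep until the state changes.
            while not self.items:
                self.changed.wait()
            return self.items.pop()


def demo():
    buf = Buffer()
    result = []
    consumer = threading.Thread(target=lambda: result.append(buf.take()))
    consumer.start()   # may block: the buffer can still be empty here
    buf.put(42)        # the change lets the consumer proceed
    consumer.join()
    return result
```

Unlike a condition variable, an STM retry needs no explicit notification: the runtime re-executes the transaction when a variable it read is modified.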

TM Constructs
  OrElse
    Alternative transaction if the first one fails

    atomic {
      { x = buffer1.remove(); }
      orelse
      { x = buffer2.remove(); }
    }
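
The control flow of orElse can be sketched in a few lines of Python. This is a simplified, single-threaded illustration under stated assumptions: Retry, remove, and or_else are hypothetical names, and a real STM would also roll back any speculative writes made by the first alternative before running the second. Here the first alternative fails before modifying anything, so no rollback is needed.

```python
class Retry(Exception):
    """Raised by an alternative that cannot proceed right now."""


def remove(buffer):
    # Hypothetical helper: take one item, or signal that the caller must retry.
    if not buffer:
        raise Retry()
    return buffer.pop()


def or_else(first, second):
    """Run `first`; if it signals Retry, run `second` instead."""
    try:
        return first()
    except Retry:
        return second()


buffer1, buffer2 = [], ["b"]
# buffer1 is empty, so the second alternative supplies x.
x = or_else(lambda: remove(buffer1), lambda: remove(buffer2))
```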

Locks vs. Transactions
  Suppose a global counter

  With a lock:
    bool m; // mutex
    lock(&m)
    counter := counter + 1
    unlock(&m)

  With a transaction:
    atomic {
      counter := counter + 1
    }
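
For comparison, the lock-based half of this slide can be written as runnable Python. Python has no built-in atomic block, so only the mutex version is shown; the worker and run helpers and their parameters are assumptions made for the example.

```python
import threading

counter = 0
m = threading.Lock()  # plays the role of the mutex `m` on the slide


def worker(iterations):
    global counter
    for _ in range(iterations):
        with m:                    # lock(&m) ... unlock(&m)
            counter = counter + 1


def run(num_threads=4, iterations=1000):
    """Increment the shared counter from several threads and return it."""
    global counter
    counter = 0
    threads = [threading.Thread(target=worker, args=(iterations,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

Every piece of code that touches counter has to know about m and use it; the atomic version leaves that bookkeeping to the runtime.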

Composing Software
  What can go wrong?

    v = hash1.delete(A);
    hash2.insert(A, v);

  Use an extra lock?
  What if we insert into another hash, like hash3?

Composing Software with STM
  The Atomic keyword

    atomic {
      v = hash1.delete(A);
      hash2.insert(A, v);
    }

  Runtime is responsible for making it atomic
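
To make "the runtime is responsible" concrete, here is a toy STM sketch in Python. It is an illustration under several assumptions, not how any of the libraries discussed later work: writes are buffered (lazy versioning), the read set is validated at commit time (late detection) under one global commit lock, and a running transaction is not protected from seeing inconsistent snapshots. TVar, Transaction, atomic, and move are hypothetical names.

```python
import threading

_commit_lock = threading.Lock()  # serializes commits only


class TVar:
    """A shared cell with a version number bumped on every commit."""

    def __init__(self, value):
        self.value = value
        self.version = 0


class Transaction:
    def __init__(self):
        self.reads = {}   # TVar -> version seen at first read
        self.writes = {}  # TVar -> speculative value

    def read(self, tvar):
        if tvar in self.writes:
            return self.writes[tvar]
        self.reads.setdefault(tvar, tvar.version)  # record version before value
        return tvar.value

    def write(self, tvar, value):
        self.writes[tvar] = value

    def commit(self):
        with _commit_lock:
            # Validate: abort if anything we read was committed to meanwhile.
            if any(tvar.version != seen for tvar, seen in self.reads.items()):
                return False
            for tvar, value in self.writes.items():
                tvar.value = value
                tvar.version += 1
            return True


def atomic(body):
    """Re-run `body` until its transaction commits without conflicts."""
    while True:
        tx = Transaction()
        result = body(tx)
        if tx.commit():
            return result


hash1 = TVar({"A": 1})
hash2 = TVar({})


def move(tx):
    table1 = dict(tx.read(hash1))  # copies: the shared dicts are never mutated
    table2 = dict(tx.read(hash2))
    v = table1.pop("A")            # v = hash1.delete(A)
    table2["A"] = v                # hash2.insert(A, v)
    tx.write(hash1, table1)
    tx.write(hash2, table2)
    return v


moved = atomic(move)
```

Adding a third table such as hash3 only means reading and writing one more TVar inside move; no new lock has to be introduced.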

PROJECT DIMENSIONS (OR THE KNOBS WE CAN TWIST)

Classification of TMs
  Originally a hardware concept
  Hardware Transactional Memory
  Software Transactional Memory
  Hybrid Transactional Memory
  Transactional Synchronization Extensions (TSX) on Intel's Haswell processors
    Uses cache coherence protocol
    Bug fixed in later Broadwell processors

Isolation Levels
  Related to the interaction between transactional and non-transactional code
  Strong Atomicity
    Accesses outside transactions are consistent with accesses inside them
  Weak Atomicity
    Accesses to shared data outside transactions may cause race conditions

Implementation
  Two basic mechanisms provide atomicity, consistency, and isolation
  Conflict Detection and Resolution
    Detects if two or more transactions conflict (race condition)
    Then, ideally, one of them will succeed
  Data Versioning
    Original and speculative data

Conflict Granularity
  Different granularities for detecting conflicts
  Objects
    Less overhead
    May report false conflicts
  Word
    More overhead
    No false conflicts (unless variables are smaller than words)
  Cache line
    Compromise
    Requires hardware support (Intel)
    False conflicts if variables are in the same cache line or even in another set
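
A small arithmetic sketch shows how a false conflict arises at cache-line granularity. The word size (8 bytes), cache-line size (64 bytes), addresses, and function names are all assumptions chosen for illustration; the check also ignores whether the accesses are reads or writes, whereas a real conflict requires at least one write.

```python
WORD_BYTES = 8         # assumed word size
CACHE_LINE_BYTES = 64  # assumed cache-line size


def conflict_unit(address, granularity_bytes):
    """Index of the detection unit (word or cache line) covering `address`."""
    return address // granularity_bytes


def conflicts(addr_a, addr_b, granularity_bytes):
    """Two accesses collide when they map to the same detection unit."""
    return (conflict_unit(addr_a, granularity_bytes)
            == conflict_unit(addr_b, granularity_bytes))


x_addr, y_addr = 0x1000, 0x1008  # two distinct variables, 8 bytes apart
word_level = conflicts(x_addr, y_addr, WORD_BYTES)        # different words
line_level = conflicts(x_addr, y_addr, CACHE_LINE_BYTES)  # same cache line
```

Here word_level is False while line_level is True: the two transactions touch different variables, yet line-granularity detection would still abort one of them.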

Conflict Detection
  Early Detection
    Also called Pessimistic
    Every time there is an access, conflicts are checked
    If the check fails, stop there
  Late Detection
    Also called Optimistic
    Checks for conflicts at the end of the transaction

Early Detection
  [Timeline figure: Trans1 and Trans2 over time, three scenarios]
  Disjoint data sets, no conflicts
  Conflict and restart of transaction 1
  Livelock (both transactions repeatedly read A, write A, and abort)

Late Detection
  [Timeline figure: Trans1 and Trans2 over time, three scenarios]
  Disjoint data sets, no conflicts
  Conflict and restart of transaction 1
  No conflicts

Late Detection
  [Timeline figure: Trans1, Trans2, and Trans3 over time]
  Conflicts may cause starvation if a transaction keeps getting aborted by shorter ones

Contention Manager
  Decides what to do in case of conflict
  Must guarantee progress
    Avoiding starvation
    Avoiding livelock (when transactions keep preventing one another from committing)
  Timid
    Aborts the transaction that generated a conflict
  Backoff
    Freezes a transaction for a while
  Greedy
    Oldest transaction continues execution
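
The three policies can be sketched as small decision functions. This is a simplified illustration, not the code of any STM library: transactions are modelled as dictionaries with a hypothetical start timestamp, each policy returns the transaction that should abort, and the backoff delays (1 ms base, 100 ms cap) are assumed values.

```python
def timid(requester, holder):
    """Timid: the transaction that ran into the conflict is aborted."""
    return requester


def greedy(requester, holder):
    """Greedy: the oldest transaction continues, so the younger one aborts."""
    return requester if requester["start"] > holder["start"] else holder


def backoff_delay_ms(aborts, base_ms=1, cap_ms=100, exponential=True):
    """Pause in milliseconds before the next attempt after `aborts` aborts."""
    if exponential:
        delay = base_ms * 2 ** aborts     # 1, 2, 4, 8, ...
    else:
        delay = base_ms * (aborts + 1)    # linear: 1, 2, 3, 4, ...
    return min(delay, cap_ms)


t_old = {"name": "T1", "start": 1}
t_new = {"name": "T2", "start": 5}
loser_when_new_requests = greedy(t_new, t_old)  # T2 is younger
loser_when_old_requests = greedy(t_old, t_new)  # still T2
```

Greedy gives the oldest transaction a bound on how long it can be delayed, which addresses starvation; backoff spreads out retries, which reduces the chance of the livelock pattern shown for early detection.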

Data Versioning
  Eager Versioning
    Variables are modified in place with speculative data
    Original values are copied to an undo log
  Lazy Versioning
    Speculative data are stored in a buffer
    Only committed data are stored in the variables


Examples of Data Versioning

EAGER
  Step                   Data   Undo log
  Initial state          x:10   (empty)
  Transaction writes 77  x:77   x:10
  Commit                 x:77   x:10
  Abort                  x:10   x:10

LAZY
  Step                   Data   Buffer
  Initial state          x:10   (empty)
  Transaction writes 77  x:10   x:77
  Commit                 x:77   x:77
  Abort                  x:10   x:77
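
The x:10 / 77 example can be replayed with two small classes. This is a minimal single-threaded sketch with no conflict detection; EagerTx, LazyTx, and the dictionary standing in for memory are hypothetical names. Unlike the slide, both classes discard their log or buffer once the transaction ends.

```python
class EagerTx:
    """Eager versioning: write in place, keep original values in an undo log."""

    def __init__(self, memory):
        self.memory = memory
        self.undo_log = {}

    def write(self, name, value):
        self.undo_log.setdefault(name, self.memory[name])  # save original once
        self.memory[name] = value                          # speculative, in place

    def commit(self):
        self.undo_log.clear()              # data is already in place

    def abort(self):
        self.memory.update(self.undo_log)  # restore the original values
        self.undo_log.clear()


class LazyTx:
    """Lazy versioning: buffer writes, publish them only on commit."""

    def __init__(self, memory):
        self.memory = memory
        self.buffer = {}

    def write(self, name, value):
        self.buffer[name] = value          # memory stays untouched

    def commit(self):
        self.memory.update(self.buffer)    # copy speculative data to memory
        self.buffer.clear()

    def abort(self):
        self.buffer.clear()                # nothing to undo


eager_memory = {"x": 10}
eager = EagerTx(eager_memory)
eager.write("x", 77)      # memory now holds 77, undo log holds 10
eager.abort()             # memory is restored to 10

lazy_memory = {"x": 10}
lazy = LazyTx(lazy_memory)
lazy.write("x", 77)       # memory still holds 10, buffer holds 77
lazy.commit()             # memory now holds 77
```

The trade-off is visible in the method bodies: eager versioning makes commit cheap and abort costly, while lazy versioning does the opposite.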

STM Libraries

STM       Versioning               Conflict Detection   Contention Manager
AdaptSTM  Eager (low contention)   Early                Timid (low contention)
          Lazy (high contention)                        Exponential backoff (high contention)
SwissTM   Lazy                     Early (write-write)  Timid (short transactions and reads)
                                   Late (read-write)    Linear backoff (write-write)
TinySTM   Lazy                     Early                Timid
TL2       Lazy                     Late                 Timid
                                                        Linear backoff (after 3 aborts)

EXPERIMENTS

Objectives
  Find out how different STM libraries behave in terms of performance and energy
    Different contention scenarios
    Number of threads in excess of cores
    Scalability trends

Benchmarks
  STAMP: Stanford Transactional Applications for Multi-Processing
    Minh et al., IISWC 2008
    Set of 8 applications
    Portable
    Widely used for evaluation of STMs

Characteristics of STAMP Apps

Application  Transaction length  Read-write set  Time in transactions  Contention
Genome       Medium              Medium          High                  Low
Intruder     Short               Medium          Medium                High
Kmeans       Short               Small           Low                   Low
Labyrinth    Long                Large           High                  High
Vacation     Medium              Medium          High                  Medium
Yada         Long                Large           High                  Medium

CHARACTERIZATION OF PERFORMANCE AND ENERGY

Simulation Environment
  SGI Altix with two Xeon E5620 (4 cores plus HT), 6 GB RAM
  Baseboard Management Controller (BMC) data for energy extracted using FreeIPMI
  SUSE Linux SP11, GNU G to 64 threads
  Each experiment was executed 10 times

[Figure: genome. Execution time (seconds) and energy consumption (Joules) versus number of threads for TL2, TinySTM, SwissTM, and AdaptSTM.]

[Figure: intruder. Execution time (seconds) and energy consumption (Joules) versus number of threads for TL2, TinySTM, SwissTM, and AdaptSTM. High contention.]

[Figure: kmeans. Execution time (seconds) and energy consumption (Joules) versus number of threads for TL2, TinySTM, SwissTM, and AdaptSTM.]

[Figure: labyrinth. Execution time (seconds) and energy consumption (Joules) versus number of threads for TL2, TinySTM, SwissTM, and AdaptSTM.]

[Figure: vacation-high. Execution time (seconds) and energy consumption (Joules) versus number of threads for TL2, TinySTM, SwissTM, and AdaptSTM.]

[Figure: yada. Execution time (seconds) and energy consumption (Joules) versus number of threads for TL2, TinySTM, SwissTM, and AdaptSTM. High contention.]


Discussion
  TL2 presents the lowest growth in conflict rate
    Followed by AdaptSTM, SwissTM, TinySTM
  AdaptSTM shows better scalability for short transactions
    Exponential backoff is good for higher contention
  SwissTM is better for longer transactions
    Greedy contention manager, late conflict detection for r-w
  TL2 is better for long transactions and medium to high contention
    Late conflict detection
  TinySTM is efficient for small abort rates
    Timid contention manager
  Bottom line
    No one excels in all cases!

CURRENT AND FUTURE WORK


Current and Future Work on Manycore at UFPEL
  Optimization of STM libraries for higher number of cores/threads
    Better benchmarks for higher number of cores?
  STMs for Distributed Memory/Cloud
  Study and adaptation of STM behavior over future memory technologies (Phase-Change Memory)
  Scheduling for manycore (Anahy)
  Quantum Computing simulation on manycore/GPUs/clusters/cloud/toasters
  All about Green Computing

Reviews of this Work



Definition of SOA. Capgemini University Technology Services School. 2006 Capgemini - All rights reserved November 2006 SOA for Software Architects/ 2

Definition of SOA. Capgemini University Technology Services School. 2006 Capgemini - All rights reserved November 2006 SOA for Software Architects/ 2 Gastcollege BPM Definition of SOA Services architecture is a specific approach of organizing the business and its IT support to reduce cost, deliver faster & better and leverage the value of IT. November

More information

Multi-Tenant Scalability Guidance for Exchange Server 2010 Service Pack 2

Multi-Tenant Scalability Guidance for Exchange Server 2010 Service Pack 2 Multi-Tenant Scalability Guidance for Exchange Server 2010 Service Pack 2 Customer Advisory Team Exchange Product Group Microsoft Corporation December 2011 Contents Introduction... 3 Scalability Guidance...

More information

PostgreSQL Concurrency Issues

PostgreSQL Concurrency Issues PostgreSQL Concurrency Issues 1 PostgreSQL Concurrency Issues Tom Lane Red Hat Database Group Red Hat, Inc. PostgreSQL Concurrency Issues 2 Introduction What I want to tell you about today: How PostgreSQL

More information

Amazon EC2 XenApp Scalability Analysis

Amazon EC2 XenApp Scalability Analysis WHITE PAPER Citrix XenApp Amazon EC2 XenApp Scalability Analysis www.citrix.com Table of Contents Introduction...3 Results Summary...3 Detailed Results...4 Methods of Determining Results...4 Amazon EC2

More information

A Fast Inter-Kernel Communication and Synchronization Layer for MetalSVM

A Fast Inter-Kernel Communication and Synchronization Layer for MetalSVM A Fast Inter-Kernel Communication and Synchronization Layer for MetalSVM Pablo Reble, Stefan Lankes, Carsten Clauss, Thomas Bemmerl Chair for Operating Systems, RWTH Aachen University Kopernikusstr. 16,

More information

SWARM: A Parallel Programming Framework for Multicore Processors. David A. Bader, Varun N. Kanade and Kamesh Madduri

SWARM: A Parallel Programming Framework for Multicore Processors. David A. Bader, Varun N. Kanade and Kamesh Madduri SWARM: A Parallel Programming Framework for Multicore Processors David A. Bader, Varun N. Kanade and Kamesh Madduri Our Contributions SWARM: SoftWare and Algorithms for Running on Multicore, a portable

More information

Parallel Computing 37 (2011) 26 41. Contents lists available at ScienceDirect. Parallel Computing. journal homepage: www.elsevier.

Parallel Computing 37 (2011) 26 41. Contents lists available at ScienceDirect. Parallel Computing. journal homepage: www.elsevier. Parallel Computing 37 (2011) 26 41 Contents lists available at ScienceDirect Parallel Computing journal homepage: www.elsevier.com/locate/parco Architectural support for thread communications in multi-core

More information

Microkernels & Database OSs. Recovery Management in QuickSilver. DB folks: Stonebraker81. Very different philosophies

Microkernels & Database OSs. Recovery Management in QuickSilver. DB folks: Stonebraker81. Very different philosophies Microkernels & Database OSs Recovery Management in QuickSilver. Haskin88: Roger Haskin, Yoni Malachi, Wayne Sawdon, Gregory Chan, ACM Trans. On Computer Systems, vol 6, no 1, Feb 1988. Stonebraker81 OS/FS

More information

Title Release Notes PC SDK 5.14.03. Date 2012-03-30. Dealt with by, telephone. Table of Content GENERAL... 3. Corrected Issues 5.14.03 PDD...

Title Release Notes PC SDK 5.14.03. Date 2012-03-30. Dealt with by, telephone. Table of Content GENERAL... 3. Corrected Issues 5.14.03 PDD... 1/15 Table of Content GENERAL... 3 Release Information... 3 Introduction... 3 Installation... 4 Hardware and Software requirements... 5 Deployment... 6 Compatibility... 7 Updates in PC SDK 5.14.03 vs.

More information

Performance Benchmark for Cloud Block Storage

Performance Benchmark for Cloud Block Storage Performance Benchmark for Cloud Block Storage J.R. Arredondo vjune2013 Contents Fundamentals of performance in block storage Description of the Performance Benchmark test Cost of performance comparison

More information

Virtual Machine Instance Scheduling in IaaS Clouds

Virtual Machine Instance Scheduling in IaaS Clouds Virtual Machine Instance Scheduling in IaaS Clouds Naylor G. Bachiega, Henrique P. Martins, Roberta Spolon, Marcos A. Cavenaghi Departamento de Ciência da Computação UNESP - Univ Estadual Paulista Bauru,

More information

Configuring Apache Derby for Performance and Durability Olav Sandstå

Configuring Apache Derby for Performance and Durability Olav Sandstå Configuring Apache Derby for Performance and Durability Olav Sandstå Database Technology Group Sun Microsystems Trondheim, Norway Overview Background > Transactions, Failure Classes, Derby Architecture

More information

Portable Scale-Out Benchmarks for MySQL. MySQL User Conference 2008 Robert Hodges CTO Continuent, Inc.

Portable Scale-Out Benchmarks for MySQL. MySQL User Conference 2008 Robert Hodges CTO Continuent, Inc. Portable Scale-Out Benchmarks for MySQL MySQL User Conference 2008 Robert Hodges CTO Continuent, Inc. Continuent 2008 Agenda / Introductions / Scale-Out Review / Bristlecone Performance Testing Tools /

More information

CHAPTER 6: DISTRIBUTED FILE SYSTEMS

CHAPTER 6: DISTRIBUTED FILE SYSTEMS CHAPTER 6: DISTRIBUTED FILE SYSTEMS Chapter outline DFS design and implementation issues: system structure, access, and sharing semantics Transaction and concurrency control: serializability and concurrency

More information

How To Test For Performance And Scalability On A Server With A Multi-Core Computer (For A Large Server)

How To Test For Performance And Scalability On A Server With A Multi-Core Computer (For A Large Server) Scalability Results Select the right hardware configuration for your organization to optimize performance Table of Contents Introduction... 1 Scalability... 2 Definition... 2 CPU and Memory Usage... 2

More information

CloudRank-D:A Benchmark Suite for Private Cloud Systems

CloudRank-D:A Benchmark Suite for Private Cloud Systems CloudRank-D:A Benchmark Suite for Private Cloud Systems Jing Quan Institute of Computing Technology, Chinese Academy of Sciences and University of Science and Technology of China HVC tutorial in conjunction

More information

How To Store Data On An Ocora Nosql Database On A Flash Memory Device On A Microsoft Flash Memory 2 (Iomemory)

How To Store Data On An Ocora Nosql Database On A Flash Memory Device On A Microsoft Flash Memory 2 (Iomemory) WHITE PAPER Oracle NoSQL Database and SanDisk Offer Cost-Effective Extreme Performance for Big Data 951 SanDisk Drive, Milpitas, CA 95035 www.sandisk.com Table of Contents Abstract... 3 What Is Big Data?...

More information

Maximize Performance and Scalability of RADIOSS* Structural Analysis Software on Intel Xeon Processor E7 v2 Family-Based Platforms

Maximize Performance and Scalability of RADIOSS* Structural Analysis Software on Intel Xeon Processor E7 v2 Family-Based Platforms Maximize Performance and Scalability of RADIOSS* Structural Analysis Software on Family-Based Platforms Executive Summary Complex simulations of structural and systems performance, such as car crash simulations,

More information

Atomicity for Concurrent Programs Outsourcing Report. By Khilan Gudka <kg103@doc.ic.ac.uk> Supervisor: Susan Eisenbach

Atomicity for Concurrent Programs Outsourcing Report. By Khilan Gudka <kg103@doc.ic.ac.uk> Supervisor: Susan Eisenbach Atomicity for Concurrent Programs Outsourcing Report By Khilan Gudka Supervisor: Susan Eisenbach June 23, 2007 2 Contents 1 Introduction 5 1.1 The subtleties of concurrent programming.......................

More information

MOMENTUM - A MEMORY-HARD PROOF-OF-WORK VIA FINDING BIRTHDAY COLLISIONS. DANIEL LARIMER dlarimer@invictus-innovations.com Invictus Innovations, Inc

MOMENTUM - A MEMORY-HARD PROOF-OF-WORK VIA FINDING BIRTHDAY COLLISIONS. DANIEL LARIMER dlarimer@invictus-innovations.com Invictus Innovations, Inc MOMENTUM - A MEMORY-HARD PROOF-OF-WORK VIA FINDING BIRTHDAY COLLISIONS DANIEL LARIMER dlarimer@invictus-innovations.com Invictus Innovations, Inc ABSTRACT. We introduce the concept of memory-hard proof-of-work

More information

Server Software Installation Guide

Server Software Installation Guide Server Software Installation Guide This guide provides information on...... The architecture model for GO!Enterprise MDM system setup... Hardware and supporting software requirements for GO!Enterprise

More information

OPERATING SYSTEMS Internais and Design Principles

OPERATING SYSTEMS Internais and Design Principles OPERATING SYSTEMS Internais and Design Principles FOURTH EDITION William Stallings, Ph.D. Prentice Hall Upper Saddle River, New Jersey 07458 CONTENTS Web Site for Operating Systems: Internais and Design

More information

High-Performance Concurrency Control Mechanisms for Main-Memory Databases

High-Performance Concurrency Control Mechanisms for Main-Memory Databases High-Performance Concurrency Control Mechanisms for Main-Memory Databases Per-Åke Larson 1, Spyros Blanas 2, Cristian Diaconu 1, Craig Freedman 1, Jignesh M. Patel 2, Mike Zwilling 1 Microsoft 1, University

More information

Three Paths to Faster Simulations Using ANSYS Mechanical 16.0 and Intel Architecture

Three Paths to Faster Simulations Using ANSYS Mechanical 16.0 and Intel Architecture White Paper Intel Xeon processor E5 v3 family Intel Xeon Phi coprocessor family Digital Design and Engineering Three Paths to Faster Simulations Using ANSYS Mechanical 16.0 and Intel Architecture Executive

More information

Staring into the Abyss: An Evaluation of Concurrency Control with One Thousand Cores

Staring into the Abyss: An Evaluation of Concurrency Control with One Thousand Cores Staring into the Abyss: An Evaluation of Concurrency Control with One Thousand Cores Xiangyao Yu MIT CSAIL yxy@csail.mit.edu George Bezerra MIT CSAIL gbezerra@csail.mit.edu Andrew Pavlo Srinivas Devadas

More information

FAWN - a Fast Array of Wimpy Nodes

FAWN - a Fast Array of Wimpy Nodes University of Warsaw January 12, 2011 Outline Introduction 1 Introduction 2 3 4 5 Key issues Introduction Growing CPU vs. I/O gap Contemporary systems must serve millions of users Electricity consumed

More information

Achieving Nanosecond Latency Between Applications with IPC Shared Memory Messaging

Achieving Nanosecond Latency Between Applications with IPC Shared Memory Messaging Achieving Nanosecond Latency Between Applications with IPC Shared Memory Messaging In some markets and scenarios where competitive advantage is all about speed, speed is measured in micro- and even nano-seconds.

More information

Performance Analysis of Web based Applications on Single and Multi Core Servers

Performance Analysis of Web based Applications on Single and Multi Core Servers Performance Analysis of Web based Applications on Single and Multi Core Servers Gitika Khare, Diptikant Pathy, Alpana Rajan, Alok Jain, Anil Rawat Raja Ramanna Centre for Advanced Technology Department

More information

find model parameters, to validate models, and to develop inputs for models. c 1994 Raj Jain 7.1

find model parameters, to validate models, and to develop inputs for models. c 1994 Raj Jain 7.1 Monitors Monitor: A tool used to observe the activities on a system. Usage: A system programmer may use a monitor to improve software performance. Find frequently used segments of the software. A systems

More information

Scaling HTM-Supported Database Transactions to Many Cores

Scaling HTM-Supported Database Transactions to Many Cores 1 Scaling HTM-Supported Database Transactions to Many Cores Viktor Leis, Alfons Kemper, and Thomas Neumann Abstract So far, transactional memory although a promising technique suffered from the absence

More information

Performance and scalability of a large OLTP workload

Performance and scalability of a large OLTP workload Performance and scalability of a large OLTP workload ii Performance and scalability of a large OLTP workload Contents Performance and scalability of a large OLTP workload with DB2 9 for System z on Linux..............

More information

F1: A Distributed SQL Database That Scales. Presentation by: Alex Degtiar (adegtiar@cmu.edu) 15-799 10/21/2013

F1: A Distributed SQL Database That Scales. Presentation by: Alex Degtiar (adegtiar@cmu.edu) 15-799 10/21/2013 F1: A Distributed SQL Database That Scales Presentation by: Alex Degtiar (adegtiar@cmu.edu) 15-799 10/21/2013 What is F1? Distributed relational database Built to replace sharded MySQL back-end of AdWords

More information

Exo-leasing: Escrow Synchronization for Mobile Clients of Commodity Storage Servers

Exo-leasing: Escrow Synchronization for Mobile Clients of Commodity Storage Servers Exo-leasing: Escrow Synchronization for Mobile Clients of Commodity Storage Servers Liuba Shrira 1, Hong Tian 2, and Doug Terry 3 1 Brandeis University 2 Amazon.com 3 Microsoft Research Abstract. Escrow

More information

PART IV Performance oriented design, Performance testing, Performance tuning & Performance solutions. Outline. Performance oriented design

PART IV Performance oriented design, Performance testing, Performance tuning & Performance solutions. Outline. Performance oriented design PART IV Performance oriented design, Performance testing, Performance tuning & Performance solutions Slide 1 Outline Principles for performance oriented design Performance testing Performance tuning General

More information

Scaling Database Performance in Azure

Scaling Database Performance in Azure Scaling Database Performance in Azure Results of Microsoft-funded Testing Q1 2015 2015 2014 ScaleArc. All Rights Reserved. 1 Test Goals and Background Info Test Goals and Setup Test goals Microsoft commissioned

More information