CS 33. Multithreaded Programming V. CS33 Intro to Computer Systems XXXVII 1 Copyright 2015 Thomas W. Doeppner. All rights reserved.
|
|
|
- Kelley Cameron
- 10 years ago
- Views:
Transcription
1 CS 33 Multithreaded Programming V CS33 Intro to Computer Systems XXXVII 1 Copyright 2015 Thomas W. Doeppner. All rights reserved.
2 Alternatives to Mutexes: Atomic Instructions Read-modify-write performed atomically Lock prefix may be used with certain IA32 and x86-64 instructions to make this happen lock incr x lock add %2, x It s expensive It s not portable no POSIX-threads way of doing it Windows supports» InterlockedIncrement» InterlockedDecrement CS33 Intro to Computer Systems XXXVII 2 Copyright 2015 Thomas W. Doeppner. All rights reserved.
3 Alternatives to Mutexes: Spin Locks Consider pthread_mutex_lock(&mutex); x = x+1; pthread_mutex_unlock(&mutex); A lot of overhead is required to put thread to sleep, then wake it up Rather than do that, repeatedly test mutex until it s unlocked, then lock it makes sense only on multiprocessor system CS33 Intro to Computer Systems XXXVII 3 Copyright 2015 Thomas W. Doeppner. All rights reserved.
4 Compare and Exchange cmpxchg src, dest compare contents of %eax with contents of dest» if equal, then dest = src (and ZF = 1)» otherwise %eax = dest (and ZF = 0) CS33 Intro to Computer Systems XXXVII 4 Copyright 2015 Thomas W. Doeppner. All rights reserved.
5 Spin Lock the spin lock is pointed to by the first arg (%edi) locked is 1, unlocked is 0 slock: loop:.text.globl slock, sunlock movl $0, %eax movq $1, %r10 lock cmpxchg %r10, 0(%edi) jne loop ret sunlock: movl $0, 0(%edi) ret CS33 Intro to Computer Systems XXXVII 5 Copyright 2015 Thomas W. Doeppner. All rights reserved.
6 Improved Spin Lock slock: loop:.text.globl slock, sunlock cmp $0, 0(%edi) # compare using normal instructions jne loop movl $0, %eax movq $1, %r10 lock cmpxchg %r10, 0(%edi) # verify w/ cmpxchg jne loop ret sunlock: movl $0, 0(%edi) ret CS33 Intro to Computer Systems XXXVII 6 Copyright 2015 Thomas W. Doeppner. All rights reserved.
7 Yet More From POSIX int pthread_spin_init(pthread_spin_t *s, int pshared); int pthread_spin_destroy(pthread_spin_t *s); int pthread_spin_lock(pthread_spin_t *s); int pthread_spin_trylock(pthread_spin_t *s); int pthread_spin_unlock(pthread_spin_t *s); CS33 Intro to Computer Systems XXXVII 7 Copyright 2015 Thomas W. Doeppner. All rights reserved.
8 Implementing Mutexes (1) Strategy make the usual case (no waiting) very fast can afford to take more time for the other case (waiting for the mutex) CS33 Intro to Computer Systems XXXVII 8 Copyright 2015 Thomas W. Doeppner. All rights reserved.
9 Implementing Mutexes (2) Mutex has three states unlocked locked, no waiters locked, waiting threads Locking the mutex use cmpxchg (with lock prefix) if unlocked, lock it and we re done» state changed to locked, no waiters otherwise, make futex system call to wait till it s unlocked» state changed to locked, waiting threads CS33 Intro to Computer Systems XXXVII 9 Copyright 2015 Thomas W. Doeppner. All rights reserved.
10 Implementing Mutexes (3) Unlocking the mutex if locked, but no waiters» state changed to unlocked if locked, but waiting threads» futex system call made to wake up a waiting thread CS33 Intro to Computer Systems XXXVII 10 Copyright 2015 Thomas W. Doeppner. All rights reserved.
11 Memory Allocation Multiple threads One heap Bottleneck? CS33 Intro to Computer Systems XXXVII 11 Copyright 2015 Thomas W. Doeppner. All rights reserved.
12 Solution 1 Divvy up the heap among the threads each thread has its own heap no mutexes required no bottleneck How much heap does each thread get? CS33 Intro to Computer Systems XXXVII 12 Copyright 2015 Thomas W. Doeppner. All rights reserved.
13 Solution 2 Multiple arenas each with its own mutex thread allocates from the first one it can find whose mutex was unlocked» if none, then creates new one deallocations go back to original arena CS33 Intro to Computer Systems XXXVII 13 Copyright 2015 Thomas W. Doeppner. All rights reserved.
14 Solution 3 Global heap plus per-thread heaps threads pull storage from global heap freed storage goes to per-thread heap» unless things are imbalanced then thread moves storage back to global heap mutex on only the global heap What if one thread allocates and another frees storage? CS33 Intro to Computer Systems XXXVII 14 Copyright 2015 Thomas W. Doeppner. All rights reserved.
15 Malloc/Free Implementations ptmalloc based on solution 2 in glibc (i.e., used by default) tcmalloc based on solution 3 from Google Which is best? CS33 Intro to Computer Systems XXXVII 15 Copyright 2015 Thomas W. Doeppner. All rights reserved.
16 Test Program const unsigned int N=64, nthreads=32, iters= ; int main() { } void *tfunc(void *); pthread_t thread[nthreads]; for (int i=0; i<nthreads; i++) { } pthread_create(&thread[i], 0, tfunc, (void *)i); pthread_detach(thread[i]); pthread_exit(0); void *tfunc(void *arg) { } long i; for (i=0; i<iters; i++) { } long *p = (long *)malloc(sizeof(long)*((i%n)+1)); free(p); return 0; CS33 Intro to Computer Systems XXXVII 16 Copyright 2015 Thomas W. Doeppner. All rights reserved.
17 Compiling It % gcc -o ptalloc alloc.cc lpthread % gcc -o tcalloc alloc.cc lpthread -ltcmalloc CS33 Intro to Computer Systems XXXVII 17 Copyright 2015 Thomas W. Doeppner. All rights reserved.
18 Running It $ time./ptalloc real 0m5.142s user sys 0m20.501s 0m0.024s $ time./tcalloc real 0m1.889s user sys 0m7.492s 0m0.008s CS33 Intro to Computer Systems XXXVII 18 Copyright 2015 Thomas W. Doeppner. All rights reserved.
19 What s Going On? $ strace c f./ptalloc % time seconds usecs/call calls errors syscall futex $ strace c f./tcalloc % time seconds usecs/call calls errors syscall futex CS33 Intro to Computer Systems XXXVII 19 Copyright 2015 Thomas W. Doeppner. All rights reserved.
20 Test Program 2, part 1 #define N 64 #define npairs 16 #define allocsperiter 1024 const long iters = 8*1024*1024/allocsPerIter; #define BufSize typedef struct buffer { int *buf[bufsize]; unsigned int nextin; unsigned int nextout; sem_t empty; sem_t occupied; pthread_t pthread; pthread_t cthread; } buffer_t; CS33 Intro to Computer Systems XXXVII 20 Copyright 2015 Thomas W. Doeppner. All rights reserved.
21 Test Program 2, part 2 int main() { long i; } buffer_t b[npairs]; for (i=0; i<npairs; i++) { } b[i].nextin = 0; b[i].nextout = 0; sem_init(&b[i].empty, 0, BufSize/allocsPerIter); sem_init(&b[i].occupied, 0, 0); pthread_create(&b[i].pthread, 0, prod, &b[i]); pthread_create(&b[i].cthread, 0, cons, &b[i]); for (i=0; i<npairs; i++) { } pthread_join(b[i].pthread, 0); pthread_join(b[i].cthread, 0); return 0; CS33 Intro to Computer Systems XXXVII 21 Copyright 2015 Thomas W. Doeppner. All rights reserved.
22 Test Program 2, part 3 void *prod(void *arg) { long i, j; } buffer_t *b = (buffer_t *)arg; for (i = 0; i<iters; i++) { } sem_wait(&b->empty); for (j = 0; j<allocsperiter; j++) { } b->buf[b->nextin] = malloc(sizeof(int)*((j%n)+1)); if (++b->nextin >= BufSize) b->nextin = 0; sem_post(&b->occupied); return 0; CS33 Intro to Computer Systems XXXVII 22 Copyright 2015 Thomas W. Doeppner. All rights reserved.
23 Test Program 2, part 4 void *cons(void *arg) { long i, j; } buffer_t *b = (buffer_t *)arg; for (i = 0; i<iters; i++) { } sem_wait(&b->occupied); for (j = 0; j<allocsperiter; j++) { } free(b->buf[b->nextout]); if (++b->nextout >= BufSize) b->nextout = 0; sem_post(&b->empty); return 0; CS33 Intro to Computer Systems XXXVII 23 Copyright 2015 Thomas W. Doeppner. All rights reserved.
24 Running It $ time./ptalloc2 real 0m1.087s user sys 0m3.744s 0m0.204s $ time./tcalloc2 real 0m3.535s user sys 0m11.361s 0m2.112s CS33 Intro to Computer Systems XXXVII 24 Copyright 2015 Thomas W. Doeppner. All rights reserved.
25 What s Going On? $ strace c f./ptalloc2 % time seconds usecs/call calls errors syscall futex $ strace c f./tcalloc2 % time seconds usecs/call calls errors syscall futex CS33 Intro to Computer Systems XXXVII 25 Copyright 2015 Thomas W. Doeppner. All rights reserved.
26 Running it Today... sphere $ time./ptalloc real 0m2.373s user 0m9.152s sys 0m0.008s sphere $ time./tcalloc real user sys 0m4.868s 0m19.444s 0m0.020s CS33 Intro to Computer Systems XXXVII 26 Copyright 2015 Thomas W. Doeppner. All rights reserved.
27 Running it Today... kui $ time./ptalloc real 0m2.787s user 0m11.045s sys 0m0.004s kui $ time./tcalloc real user sys 0m1.701s 0m6.584s 0m0.004s CS33 Intro to Computer Systems XXXVII 27 Copyright 2015 Thomas W. Doeppner. All rights reserved.
28 Running it Today... cslab0a $ time./ptalloc real 0m2.234s user 0m8.468s sys 0m0.000s cslab0a $ time./tcalloc real user sys 0m4.938s 0m19.584s 0m0.000s CS33 Intro to Computer Systems XXXVII 28 Copyright 2015 Thomas W. Doeppner. All rights reserved.
29 What s Going On? On kui: libtcmalloc.so -> libtcmalloc.so On other machines: libtcmalloc.so -> libtcmalloc.so CS33 Intro to Computer Systems XXXVII 29 Copyright 2015 Thomas W. Doeppner. All rights reserved.
30 However... cslab0a $ time./ptalloc2 real 0m0.466s user 0m1.504s sys 0m0.212s cslab0a $ time./tcalloc2 real user sys 0m1.516s 0m5.212s 0m0.328s CS33 Intro to Computer Systems XXXVII 30 Copyright 2015 Thomas W. Doeppner. All rights reserved.
31 You ll Soon Finish CS 33 You might celebrate take another systems course» 32» 138» 166» 167 become a 33 TA CS33 Intro to Computer Systems XXXVII 31 Copyright 2015 Thomas W. Doeppner. All rights reserved.
32 Systems Courses Next Semester CS 32 you ve mastered low-level systems programming now do things at a higher level learn software-engineering techniques using Java, XML, etc. CS 138 you now know how things work on one computer what if you ve got lots of computers? some may have crashed, others may have been taken over by your worst (and brightest) enemy CS 166 liked buffer? you ll really like 166 CS 167/169 still mystified about what the OS does? write your own! CS33 Intro to Computer Systems XXXVII 32 Copyright 2015 Thomas W. Doeppner. All rights reserved.
33 The End Well, not quite Database is due on 12/16. We ll do our best to get everything else back this week. Happy coding and happy holidays! CS33 Intro to Computer Systems XXXVII 33 Copyright 2015 Thomas W. Doeppner. All rights reserved.
Systemnahe Programmierung KU
S C I E N C E P A S S I O N T E C H N O L O G Y Systemnahe Programmierung KU A6,A7 www.iaik.tugraz.at 1. Virtual Memory 2. A7 - Fast-Food Restaurant 2 Course Overview A8, A9 System Programming A7 Thread
Shared Address Space Computing: Programming
Shared Address Space Computing: Programming Alistair Rendell See Chapter 6 or Lin and Synder, Chapter 7 of Grama, Gupta, Karypis and Kumar, and Chapter 8 of Wilkinson and Allen Fork/Join Programming Model
A Comparison Of Shared Memory Parallel Programming Models. Jace A Mogill David Haglin
A Comparison Of Shared Memory Parallel Programming Models Jace A Mogill David Haglin 1 Parallel Programming Gap Not many innovations... Memory semantics unchanged for over 50 years 2010 Multi-Core x86
Threads Scheduling on Linux Operating Systems
Threads Scheduling on Linux Operating Systems Igli Tafa 1, Stavri Thomollari 2, Julian Fejzaj 3 Polytechnic University of Tirana, Faculty of Information Technology 1,2 University of Tirana, Faculty of
Middleware. Peter Marwedel TU Dortmund, Informatik 12 Germany. technische universität dortmund. fakultät für informatik informatik 12
Universität Dortmund 12 Middleware Peter Marwedel TU Dortmund, Informatik 12 Germany Graphics: Alexandra Nolte, Gesine Marwedel, 2003 2010 年 11 月 26 日 These slides use Microsoft clip arts. Microsoft copyright
W4118 Operating Systems. Instructor: Junfeng Yang
W4118 Operating Systems Instructor: Junfeng Yang Learning goals of this lecture Different flavors of synchronization primitives and when to use them, in the context of Linux kernel How synchronization
Lecture 7: Machine-Level Programming I: Basics Mohamed Zahran (aka Z) [email protected] http://www.mzahran.com
CSCI-UA.0201-003 Computer Systems Organization Lecture 7: Machine-Level Programming I: Basics Mohamed Zahran (aka Z) [email protected] http://www.mzahran.com Some slides adapted (and slightly modified)
Win32 API Emulation on UNIX for Software DSM
Win32 API Emulation on UNIX for Software DSM Agenda: Sven M. Paas, Thomas Bemmerl, Karsten Scholtyssik, RWTH Aachen, Germany http://www.lfbs.rwth-aachen.de/ [email protected] Background Our approach:
Lecture 6: Semaphores and Monitors
HW 2 Due Tuesday 10/18 Lecture 6: Semaphores and Monitors CSE 120: Principles of Operating Systems Alex C. Snoeren Higher-Level Synchronization We looked at using locks to provide mutual exclusion Locks
Tail call elimination. Michel Schinz
Tail call elimination Michel Schinz Tail calls and their elimination Loops in functional languages Several functional programming languages do not have an explicit looping statement. Instead, programmers
Project No. 2: Process Scheduling in Linux Submission due: April 28, 2014, 11:59pm
Project No. 2: Process Scheduling in Linux Submission due: April 28, 2014, 11:59pm PURPOSE Getting familiar with the Linux kernel source code. Understanding process scheduling and how different parameters
The Plan Today... System Calls and API's Basics of OS design Virtual Machines
System Calls + The Plan Today... System Calls and API's Basics of OS design Virtual Machines System Calls System programs interact with the OS (and ultimately hardware) through system calls. Called when
Dynamic Behavior Analysis Using Binary Instrumentation
Dynamic Behavior Analysis Using Binary Instrumentation Jonathan Salwan [email protected] St'Hack Bordeaux France March 27 2015 Keywords: program analysis, DBI, DBA, Pin, concrete execution, symbolic
Multi-core Programming System Overview
Multi-core Programming System Overview Based on slides from Intel Software College and Multi-Core Programming increasing performance through software multi-threading by Shameem Akhter and Jason Roberts,
Department of Electrical Engineering and Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY. 6.828 Operating System Engineering: Fall 2005
Department of Electrical Engineering and Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.828 Operating System Engineering: Fall 2005 Quiz II Solutions Average 84, median 83, standard deviation
Performance Evaluation and Optimization of A Custom Native Linux Threads Library
Center for Embedded Computer Systems University of California, Irvine Performance Evaluation and Optimization of A Custom Native Linux Threads Library Guantao Liu and Rainer Dömer Technical Report CECS-12-11
CS11 Java. Fall 2014-2015 Lecture 7
CS11 Java Fall 2014-2015 Lecture 7 Today s Topics! All about Java Threads! Some Lab 7 tips Java Threading Recap! A program can use multiple threads to do several things at once " A thread can have local
Last Class: Semaphores
Last Class: Semaphores A semaphore S supports two atomic operations: S Wait(): get a semaphore, wait if busy semaphore S is available. S Signal(): release the semaphore, wake up a process if one is waiting
Lecture 10: Dynamic Memory Allocation 1: Into the jaws of malloc()
CS61: Systems Programming and Machine Organization Harvard University, Fall 2009 Lecture 10: Dynamic Memory Allocation 1: Into the jaws of malloc() Prof. Matt Welsh October 6, 2009 Topics for today Dynamic
A Survey of Parallel Processing in Linux
A Survey of Parallel Processing in Linux Kojiro Akasaka Computer Science Department San Jose State University San Jose, CA 95192 408 924 1000 [email protected] ABSTRACT Any kernel with parallel processing
Virtualization Technologies
12 January 2010 Virtualization Technologies Alex Landau ([email protected]) IBM Haifa Research Lab What is virtualization? Virtualization is way to run multiple operating systems and user applications on
Intel TSX (Transactional Synchronization Extensions) Mike Dai Wang and Mihai Burcea
Intel TSX (Transactional Synchronization Extensions) Mike Dai Wang and Mihai Burcea 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Example: toy banking application with RTM Code written and tested in
Topics. Producing Production Quality Software. Concurrent Environments. Why Use Concurrency? Models of concurrency Concurrency in Java
Topics Producing Production Quality Software Models of concurrency Concurrency in Java Lecture 12: Concurrent and Distributed Programming Prof. Arthur P. Goldberg Fall, 2005 2 Why Use Concurrency? Concurrent
Linux Scheduler. Linux Scheduler
or or Affinity Basic Interactive es 1 / 40 Reality... or or Affinity Basic Interactive es The Linux scheduler tries to be very efficient To do that, it uses some complex data structures Some of what it
Machine Programming II: Instruc8ons
Machine Programming II: Instrucons Move instrucons, registers, and operands Complete addressing mode, address computaon (leal) Arithmec operaons (including some x6 6 instrucons) Condion codes Control,
Operating Systems. 05. Threads. Paul Krzyzanowski. Rutgers University. Spring 2015
Operating Systems 05. Threads Paul Krzyzanowski Rutgers University Spring 2015 February 9, 2015 2014-2015 Paul Krzyzanowski 1 Thread of execution Single sequence of instructions Pointed to by the program
Overview. CMSC 330: Organization of Programming Languages. Synchronization Example (Java 1.5) Lock Interface (Java 1.5) ReentrantLock Class (Java 1.
CMSC 330: Organization of Programming Languages Java 1.5, Ruby, and MapReduce Overview Java 1.5 ReentrantLock class Condition interface - await( ), signalall( ) Ruby multithreading Concurrent programming
An Implementation Of Multiprocessor Linux
An Implementation Of Multiprocessor Linux This document describes the implementation of a simple SMP Linux kernel extension and how to use this to develop SMP Linux kernels for architectures other than
Last Class: OS and Computer Architecture. Last Class: OS and Computer Architecture
Last Class: OS and Computer Architecture System bus Network card CPU, memory, I/O devices, network card, system bus Lecture 3, page 1 Last Class: OS and Computer Architecture OS Service Protection Interrupts
SYSTEM ecos Embedded Configurable Operating System
BELONGS TO THE CYGNUS SOLUTIONS founded about 1989 initiative connected with an idea of free software ( commercial support for the free software ). Recently merged with RedHat. CYGNUS was also the original
Applying Clang Static Analyzer to Linux Kernel
Applying Clang Static Analyzer to Linux Kernel 2012/6/7 FUJITSU COMPUTER TECHNOLOGIES LIMITED Hiroo MATSUMOTO 管 理 番 号 1154ka1 Copyright 2012 FUJITSU COMPUTER TECHNOLOGIES LIMITED Abstract Now there are
Assembly Language: Function Calls" Jennifer Rexford!
Assembly Language: Function Calls" Jennifer Rexford! 1 Goals of this Lecture" Function call problems:! Calling and returning! Passing parameters! Storing local variables! Handling registers without interference!
For a 64-bit system. I - Presentation Of The Shellcode
#How To Create Your Own Shellcode On Arch Linux? #Author : N3td3v!l #Contact-mail : [email protected] #Website : Nopotm.ir #Spcial tnx to : C0nn3ct0r And All Honest Hackerz and Security Managers I - Presentation
SWARM: A Parallel Programming Framework for Multicore Processors. David A. Bader, Varun N. Kanade and Kamesh Madduri
SWARM: A Parallel Programming Framework for Multicore Processors David A. Bader, Varun N. Kanade and Kamesh Madduri Our Contributions SWARM: SoftWare and Algorithms for Running on Multicore, a portable
CS414 SP 2007 Assignment 1
CS414 SP 2007 Assignment 1 Due Feb. 07 at 11:59pm Submit your assignment using CMS 1. Which of the following should NOT be allowed in user mode? Briefly explain. a) Disable all interrupts. b) Read the
Operating Systems and Networks
recap Operating Systems and Networks How OS manages multiple tasks Virtual memory Brief Linux demo Lecture 04: Introduction to OS-part 3 Behzad Bordbar 47 48 Contents Dual mode API to wrap system calls
Thread Synchronization and the Java Monitor
Articles News Weblogs Buzz Chapters Forums Table of Contents Order the Book Print Email Screen Friendly Version Previous Next Chapter 20 of Inside the Java Virtual Machine Thread Synchronization by Bill
10 Things Every Linux Programmer Should Know Linux Misconceptions in 30 Minutes
10 Things Every Linux Programmer Should Know Linux Misconceptions in 30 Minutes Muli Ben-Yehuda [email protected] IBM Haifa Research Labs Linux Kernel Workshop, March 2004 p.1/14 TOC What this talk is about
Frysk The Systems Monitoring and Debugging Tool. Andrew Cagney
Frysk The Systems Monitoring and Debugging Tool Andrew Cagney Agenda Two Use Cases Motivation Comparison with Existing Free Technologies The Frysk Architecture and GUI Command Line Utilities Current Status
Writing a C-based Client/Server
Working the Socket Writing a C-based Client/Server Consider for a moment having the massive power of different computers all simultaneously trying to compute a problem for you -- and still being legal!
MatrixSSL Developer s Guide
MatrixSSL Developer s Guide This document discusses developing with MatrixSSL. It includes instructions on integrating MatrixSSL into an application and a description of the configurable options for modifying
First-class User Level Threads
First-class User Level Threads based on paper: First-Class User Level Threads by Marsh, Scott, LeBlanc, and Markatos research paper, not merely an implementation report User-level Threads Threads managed
Monitors, Java, Threads and Processes
Monitors, Java, Threads and Processes 185 An object-oriented view of shared memory A semaphore can be seen as a shared object accessible through two methods: wait and signal. The idea behind the concept
An Introduction To Simple Scheduling (Primarily targeted at Arduino Platform)
An Introduction To Simple Scheduling (Primarily targeted at Arduino Platform) I'm late I'm late For a very important date. No time to say "Hello, Goodbye". I'm late, I'm late, I'm late. (White Rabbit in
Korset: Code-based Intrusion Detection for Linux
Problem Korset Theory Implementation Evaluation Epilogue Korset: Code-based Intrusion Detection for Linux Ohad Ben-Cohen Avishai Wool Tel Aviv University Problem Korset Theory Implementation Evaluation
System Calls and Standard I/O
System Calls and Standard I/O Professor Jennifer Rexford http://www.cs.princeton.edu/~jrex 1 Goals of Today s Class System calls o How a user process contacts the Operating System o For advanced services
Virtuozzo Virtualization SDK
Virtuozzo Virtualization SDK Programmer's Guide February 18, 2016 Copyright 1999-2016 Parallels IP Holdings GmbH and its affiliates. All rights reserved. Parallels IP Holdings GmbH Vordergasse 59 8200
LLVMLinux: Embracing the Dragon
LLVMLinux: Embracing the Dragon Presented by: Behan Webster ( lead) Presentation Date: 2014.08.22 Clang/LLVM LLVM is a Toolchain Toolkit (libraries from which compilers and related technologies can be
A Comparison of Task Pools for Dynamic Load Balancing of Irregular Algorithms
A Comparison of Task Pools for Dynamic Load Balancing of Irregular Algorithms Matthias Korch Thomas Rauber Universität Bayreuth Fakultät für Mathematik, Physik und Informatik Fachgruppe Informatik {matthias.korch,
OMPT: OpenMP Tools Application Programming Interfaces for Performance Analysis
OMPT: OpenMP Tools Application Programming Interfaces for Performance Analysis Alexandre Eichenberger, John Mellor-Crummey, Martin Schulz, Michael Wong, Nawal Copty, John DelSignore, Robert Dietrich, Xu
Instruction Set Architecture
CS:APP Chapter 4 Computer Architecture Instruction Set Architecture Randal E. Bryant adapted by Jason Fritts http://csapp.cs.cmu.edu CS:APP2e Hardware Architecture - using Y86 ISA For learning aspects
Free Java textbook available online. Introduction to the Java programming language. Compilation. A simple java program
Free Java textbook available online "Thinking in Java" by Bruce Eckel, 4th edition, 2006, ISBN 0131872486, Pearson Education Introduction to the Java programming language CS 4354 Summer II 2015 The third
Semaphores and Monitors: High-level Synchronization Constructs
Semaphores and Monitors: High-level Synchronization Constructs 1 Synchronization Constructs Synchronization Coordinating execution of multiple threads that share data structures Past few lectures: Locks:
Free Java textbook available online. Introduction to the Java programming language. Compilation. A simple java program
Free Java textbook available online "Thinking in Java" by Bruce Eckel, 4th edition, 2006, ISBN 0131872486, Pearson Education Introduction to the Java programming language CS 4354 Summer II 2014 Jill Seaman
Understanding Hardware Transactional Memory
Understanding Hardware Transactional Memory Gil Tene, CTO & co-founder, Azul Systems @giltene 2015 Azul Systems, Inc. Agenda Brief introduction What is Hardware Transactional Memory (HTM)? Cache coherence
Distributed Systems [Fall 2012]
Distributed Systems [Fall 2012] [W4995-2] Lec 6: YFS Lab Introduction and Local Synchronization Primitives Slide acks: Jinyang Li, Dave Andersen, Randy Bryant (http://www.news.cs.nyu.edu/~jinyang/fa10/notes/ds-lec2.ppt,
SSC - Concurrency and Multi-threading Java multithreading programming - Synchronisation (I)
SSC - Concurrency and Multi-threading Java multithreading programming - Synchronisation (I) Shan He School for Computational Science University of Birmingham Module 06-19321: SSC Outline Outline of Topics
Jorix kernel: real-time scheduling
Jorix kernel: real-time scheduling Joris Huizer Kwie Min Wong May 16, 2007 1 Introduction As a specialized part of the kernel, we implemented two real-time scheduling algorithms: RM (rate monotonic) and
8.5. <summary>...26 9. Cppcheck addons...27 9.1. Using Cppcheck addons...27 9.1.1. Where to find some Cppcheck addons...27 9.2.
Cppcheck 1.72 Cppcheck 1.72 Table of Contents 1. Introduction...1 2. Getting started...2 2.1. First test...2 2.2. Checking all files in a folder...2 2.3. Excluding a file or folder from checking...2 2.4.
Embedded Programming in C/C++: Lesson-1: Programming Elements and Programming in C
Embedded Programming in C/C++: Lesson-1: Programming Elements and Programming in C 1 An essential part of any embedded system design Programming 2 Programming in Assembly or HLL Processor and memory-sensitive
3 - Lift with Monitors
3 - Lift with Monitors TSEA81 - Computer Engineering and Real-time Systems This document is released - 2015-11-24 version 1.4 Author - Ola Dahl, Andreas Ehliar Assignment - 3 - Lift with Monitors Introduction
Interpreters and virtual machines. Interpreters. Interpreters. Why interpreters? Tree-based interpreters. Text-based interpreters
Interpreters and virtual machines Michel Schinz 2007 03 23 Interpreters Interpreters Why interpreters? An interpreter is a program that executes another program, represented as some kind of data-structure.
Multi-Threading Performance on Commodity Multi-Core Processors
Multi-Threading Performance on Commodity Multi-Core Processors Jie Chen and William Watson III Scientific Computing Group Jefferson Lab 12000 Jefferson Ave. Newport News, VA 23606 Organization Introduction
1 Posix API vs Windows API
1 Posix API vs Windows API 1.1 File I/O Using the Posix API, to open a file, you use open(filename, flags, more optional flags). If the O CREAT flag is passed, the file will be created if it doesnt exist.
Why Alerts Suck and Monitoring Solutions need to become Smarter
An AppDynamics Business White Paper HOW MUCH REVENUE DOES IT GENERATE? Why Alerts Suck and Monitoring Solutions need to become Smarter I have yet to meet anyone in Dev or Ops who likes alerts. I ve also
Tasks Schedule Analysis in RTAI/Linux-GPL
Tasks Schedule Analysis in RTAI/Linux-GPL Claudio Aciti and Nelson Acosta INTIA - Depto de Computación y Sistemas - Facultad de Ciencias Exactas Universidad Nacional del Centro de la Provincia de Buenos
Threads 1. When writing games you need to do more than one thing at once.
Threads 1 Threads Slide 1 When writing games you need to do more than one thing at once. Threads offer a way of automatically allowing more than one thing to happen at the same time. Java has threads as
OMPT and OMPD: OpenMP Tools Application Programming Interfaces for Performance Analysis and Debugging
OMPT and OMPD: OpenMP Tools Application Programming Interfaces for Performance Analysis and Debugging Alexandre Eichenberger, John Mellor-Crummey, Martin Schulz, Nawal Copty, John DelSignore, Robert Dietrich,
Effective Computing with SMP Linux
Effective Computing with SMP Linux Multi-processor systems were once a feature of high-end servers and mainframes, but today, even desktops for personal use have multiple processors. Linux is a popular
Process Scheduling CS 241. February 24, 2012. Copyright University of Illinois CS 241 Staff
Process Scheduling CS 241 February 24, 2012 Copyright University of Illinois CS 241 Staff 1 Announcements Mid-semester feedback survey (linked off web page) MP4 due Friday (not Tuesday) Midterm Next Tuesday,
Introduction. Reading. Today MPI & OpenMP papers Tuesday Commutativity Analysis & HPF. CMSC 818Z - S99 (lect 5)
Introduction Reading Today MPI & OpenMP papers Tuesday Commutativity Analysis & HPF 1 Programming Assignment Notes Assume that memory is limited don t replicate the board on all nodes Need to provide load
MatrixSSL Porting Guide
MatrixSSL Porting Guide Electronic versions are uncontrolled unless directly accessed from the QA Document Control system. Printed version are uncontrolled except when stamped with VALID COPY in red. External
Embedded Systems. 6. Real-Time Operating Systems
Embedded Systems 6. Real-Time Operating Systems Lothar Thiele 6-1 Contents of Course 1. Embedded Systems Introduction 2. Software Introduction 7. System Components 10. Models 3. Real-Time Models 4. Periodic/Aperiodic
Using Power to Improve C Programming Education
Using Power to Improve C Programming Education Jonas Skeppstedt Department of Computer Science Lund University Lund, Sweden [email protected] jonasskeppstedt.net jonasskeppstedt.net [email protected]
Java Memory Model: Content
Java Memory Model: Content Memory Models Double Checked Locking Problem Java Memory Model: Happens Before Relation Volatile: in depth 16 March 2012 1 Java Memory Model JMM specifies guarantees given by
Performance Tools for Parallel Java Environments
Performance Tools for Parallel Java Environments Sameer Shende and Allen D. Malony Department of Computer and Information Science, University of Oregon {sameer,malony}@cs.uoregon.edu http://www.cs.uoregon.edu/research/paracomp/tau
! Past few lectures: Ø Locks: provide mutual exclusion Ø Condition variables: provide conditional synchronization
1 Synchronization Constructs! Synchronization Ø Coordinating execution of multiple threads that share data structures Semaphores and Monitors: High-level Synchronization Constructs! Past few lectures:
Introduction to Multicore Programming
Introduction to Multicore Programming Marc Moreno Maza University of Western Ontario, London, Ontario (Canada) CS 4435 - CS 9624 (Moreno Maza) Introduction to Multicore Programming CS 433 - CS 9624 1 /
Multiprocessor Scheduling and Scheduling in Linux Kernel 2.6
Multiprocessor Scheduling and Scheduling in Linux Kernel 2.6 Winter Term 2008 / 2009 Jun.-Prof. Dr. André Brinkmann [email protected] Universität Paderborn PC² Agenda Multiprocessor and
Kernel Synchronization and Interrupt Handling
Kernel Synchronization and Interrupt Handling Oliver Sengpie, Jan van Esdonk Arbeitsbereich Wissenschaftliches Rechnen Fachbereich Informatik Fakultt fr Mathematik, Informatik und Naturwissenschaften Universitt
MPLAB Harmony System Service Libraries Help
MPLAB Harmony System Service Libraries Help MPLAB Harmony Integrated Software Framework v1.08 All rights reserved. This section provides descriptions of the System Service libraries that are available
Using the Intel Inspector XE
Using the Dirk Schmidl [email protected] Rechen- und Kommunikationszentrum (RZ) Race Condition Data Race: the typical OpenMP programming error, when: two or more threads access the same memory
Hacking Techniques & Intrusion Detection. Ali Al-Shemery arabnix [at] gmail
Hacking Techniques & Intrusion Detection Ali Al-Shemery arabnix [at] gmail All materials is licensed under a Creative Commons Share Alike license http://creativecommonsorg/licenses/by-sa/30/ # whoami Ali
Introduction. Application Security. Reasons For Reverse Engineering. This lecture. Java Byte Code
Introduction Application Security Tom Chothia Computer Security, Lecture 16 Compiled code is really just data which can be edit and inspected. By examining low level code protections can be removed and
Grundlagen der Betriebssystemprogrammierung
Grundlagen der Betriebssystemprogrammierung Präsentation A3, A4, A5, A6 21. März 2013 IAIK Grundlagen der Betriebssystemprogrammierung 1 / 73 1 A3 - Function Pointers 2 A4 - C++: The good, the bad and
Operating System Manual. Realtime Communication System for netx. Kernel API Function Reference. www.hilscher.com.
Operating System Manual Realtime Communication System for netx Kernel API Function Reference Language: English www.hilscher.com rcx - Kernel API Function Reference 2 Copyright Information Copyright 2005-2007
Chapter 2 Process, Thread and Scheduling
Chapter 2 Process, Thread and Scheduling Scheduler Class and Priority XIANG Yong [email protected] Outline Scheduling Class and Priority Dispatch Queues & Dispatch Tables Thread Priorities & Scheduling
Whitepaper: performance of SqlBulkCopy
We SOLVE COMPLEX PROBLEMS of DATA MODELING and DEVELOP TOOLS and solutions to let business perform best through data analysis Whitepaper: performance of SqlBulkCopy This whitepaper provides an analysis
X86-64 Architecture Guide
X86-64 Architecture Guide For the code-generation project, we shall expose you to a simplified version of the x86-64 platform. Example Consider the following Decaf program: class Program { int foo(int
Course Development of Programming for General-Purpose Multicore Processors
Course Development of Programming for General-Purpose Multicore Processors Wei Zhang Department of Electrical and Computer Engineering Virginia Commonwealth University Richmond, VA 23284 [email protected]
