Linux Profiling and Optimization: The Black Art of Linux Performance Tuning
Federico Lucifredi, Platform Orchestra Director, Novell, Inc.
2 0 - Rationales

3 System Optimization Rationales
What are we optimizing?
  Blind, pointless or premature optimization is the worst of all evils
  You must know the gory details
  Yes, that means the kernel, the CPU and all the rest
How are we altering the system to measure it?
  Heisenberg's law of profiling
  Different tools leave different prints
Tooling criteria
  availability and stability
  level of noise introduced
  multiple tools will be necessary, even desirable
  how does it fail?
  ...

4 System's Influences on Profiling
Buffering and Caching
Swap
Runqueue process states
Interrupt count
I/O Hardware
CPU specifics
Memory architecture
VM garbage collection algorithms
...

5 Performance Review Outline
Document your investigation
  You have to prove your case
Determine the system's baseline performance
  Start with a clean-slate, no-load system
Determine the actual performance of X
  How is the system performing with the current version of X?
Track down obvious bottlenecks
  running out of...?
Establish a realistic target metric
Optimize only what is necessary. Leave aesthetics to a refactoring
Time (YOURS) is the other constraining factor

6 Testing Methodology
Document your investigation (Again!)
  You will forget

7 Testing Methodology
Document your investigation (Again!)
  You will forget
Change only one thing at a time
  You are smart, but not that smart
Verify your conjecture
  you need tangible proof: cold hard facts
Check again with another tool
  the one you just used lied to ya
Be very patient
  This is hard, frustrating work. Period.
Use a process
  Others need to know you have not escaped the country (yet)
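The "cold hard facts" advice can be made concrete with a small harness that repeats a measurement and keeps the spread, not just one number. This is a minimal sketch, not part of the original talk: `measure` and `workload` are hypothetical names, and `workload` merely stands in for the X under investigation.

```python
import statistics
import time


def measure(fn, repeats=5):
    """Run fn() `repeats` times and summarise the wall-clock timings."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return {
        "runs": repeats,
        "median_s": statistics.median(samples),
        "min_s": min(samples),
        "max_s": max(samples),
    }


def workload(n=20000):
    # Hypothetical stand-in for "X", the code being investigated.
    return sum(i * i for i in range(n))
```

Calling `measure(workload)` before and after a single change, and writing both results down, gives a baseline and a comparison; a large gap between `min_s` and `max_s` suggests the system itself is adding noise.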
8 Profiling Objectives
Don't ever optimize everything
  You can't: conflicting aims.
Be clear on purpose
  Optimization: memory, permanent storage, net throughput, speed, etc.
  Debugging: memory leaks
  Testing: coverage, load testing
Choose a clear objective
  How many transactions? How little RAM? How many clients?
Determine acceptable tradeoffs
  This is obvious to you, make sure it is to those above you

9 Tools Lie
Interfering with normal operation
  You are running on the system you are observing
  You are instrumenting the OS or the binary
  Even hardware emulators have limits - and anything else is worse
Common results
  Incomplete information is by nature misleading
  Complete information is very obtrusive
  Incorrect or corrupted information is often produced (sic)
  Footprint size varies
  Failure modes often depend on the technology used
  Out-of-scope data is misleading and/or overwhelming
Verify your results
  the odd as well as the actionable

10 I - Gory Details

11 Hardware Architecture: RAM
Speedy storage medium
  Orders of magnitude faster (or slower?)
  every application running on a 7th-generation CPU has a bottleneck in RAM fetching
  whenever pre-fetching fails, things get even worse (instruction optimization)
  you can hand-tune prefetching (_mm_prefetch), 100 clocks in advance
Natural-order access problems
  paging, swapping and generally all large-RAM access scenarios
Less is more
  you have oodles of RAM, but using less of it helps caching
  choosing cache-friendly algorithms
Size buffers to minimize eviction
Optimize RAM use for cache architecture
  not everything needs to be done by hand -> Intel compiler

12 Hardware Architecture: The Cache
Helping RAM to catch up
Different architectures in different generations
  P4s have L1 (data only), L2 (data + instructions; 30 times larger, 3 times slower), optional L3
Different caching algorithms
  make exactly divining the caching behavior hard, but there is still hope
  whenever pre-fetching fails, things get even worse (instruction optimization)
  you can hand-tune prefetching (_mm_prefetch), 100 clocks in advance
Locality could be a problem in this context
  P4 caches by address!
  L1 cache: 64 B lines, 128 lines = 8192 bytes
Alignment also relevant
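The L1 figures above (64-byte lines, 128 lines) also explain why alignment matters: the same buffer can occupy one line or two depending on where it starts. A minimal sketch of that arithmetic follows; the function names are illustrative, and the constants are simply the numbers from the slide.

```python
LINE_BYTES = 64    # L1 line size quoted on the slide
NUM_LINES = 128    # number of L1 lines quoted on the slide


def cache_bytes(line_bytes=LINE_BYTES, num_lines=NUM_LINES):
    """Total cache capacity in bytes."""
    return line_bytes * num_lines


def lines_touched(start_addr, length, line_bytes=LINE_BYTES):
    """Number of distinct cache lines covered by [start_addr, start_addr + length)."""
    if length <= 0:
        return 0
    first = start_addr // line_bytes
    last = (start_addr + length - 1) // line_bytes
    return last - first + 1
```

A 64-byte structure starting at address 0 touches one line; the same structure starting at address 32 straddles two, which doubles the number of lines it can evict.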
13 Hardware Architecture: The CPU
Optimization strategies for x86 families
  Completely different in different generations
  i386 (1985): selecting assembly instructions to hand-craft fast code
  i486 (1989): different instruction choice, but same approach
  Pentium (1993): U-V pairing, level 2 cache, MMX instructions
  sixth generation: minimizing data dependencies to leverage n-way pipelines
    cache more important than ever
    SIMD
    4:1:1 ordering to leverage multiple execution units
  Pentium 4:
    hardware prefetch
    branching errors
    data dependencies (micro-ops, leverage pipelines)
    throughput and latency
    and you still have hyperthreading to be added in this picture...
The Intel compiler lets you define per-arch functions

14 Hardware Architecture: More CPU
Instructions trickery
  loop unrolling
  contrasting objectives: keep it small (P4 caches decoded instructions)
Important: branch prediction
  increasing accuracy of branch prediction impacts cache
  reorder to leverage the way the processor guesses (most likely case first)
  remove branching with CMOV, allow processor to go ahead on both paths
  Optimize for long-term predictability, not first execution
  a missed branch prediction has a large cost (missed opportunity, cache misses, resets)
SIMD and MMX
  Sometimes we can even get to use them
  The compiler can help a bit: check options
Pause (new)
  delayed NOOP that reduces polling to RAM bus speed (_mm_pause)

15 Hardware Arch: Slow Instructions
Typical approaches
  Precomputed table
    maximize cache hits from table
    keep it small (or access it cleverly)
  Latency, throughput concerns
    find other work to do
  Fetching: top point to optimize
  Instructions that cannot be executed concurrently
    Same execution port required
The floating point specials
  Numeric exceptions
  denormals
  integer rounding (x86) vs truncation (C)
  partial flag stalls
and you still have hyperthreading to be added in this picture...
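As an illustration of the precomputed-table approach, here is a small bit-counting sketch. It is not from the talk: the 256-entry table and the `popcount32` name are assumptions chosen to keep the table tiny (256 bytes, so it fits in a handful of cache lines). In Python the performance picture differs from hand-tuned native code, so treat this as showing the structure of the technique only.

```python
# 256-entry table: number of set bits in each possible byte value.
POPCOUNT_TABLE = bytes(bin(value).count("1") for value in range(256))


def popcount32(word):
    """Count set bits in a 32-bit unsigned value using four table lookups."""
    word &= 0xFFFFFFFF
    return (POPCOUNT_TABLE[word & 0xFF]
            + POPCOUNT_TABLE[(word >> 8) & 0xFF]
            + POPCOUNT_TABLE[(word >> 16) & 0xFF]
            + POPCOUNT_TABLE[(word >> 24) & 0xFF])
```

The design choice mirrors the slide: a byte-indexed table stays small enough to remain cached, whereas a 16-bit-indexed table would need fewer lookups but 65536 entries.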
16 Kernel Details
Caching and buffering
  This is good, but you need to account for it in profiling (also: swap)
  All free memory is used (released as needed)
  use slabtop to review details
  behavior can be tuned (/proc/sys/vm/swappiness)
I/O is the most painful point
  adventurous? you can tune this via kernel boot params
  factors: latency vs throughput, fairness (to multiple processes)
  you can also tune the writing of dirty pages to disk
Linux supports many different filesystems
  different details, valid concern in *very* specific cases
  typically layout much more important

17 II - State of the machine

18 What is in the machine: hwinfo
hwinfo
  too much information: add double-dash options
  great for support purposes, but also to get your bearings
  syntax: hwinfo [options]
  example: hwinfo --cpu

19 CPU load: top
top
  top gives an overview of processes running on the system
  A number of statistics provided covering CPU usage:
    us : user mode
    sy : system (kernel/)
    ni : niced
    id : idle
    wa : I/O wait (blocking)
    hi : irq handlers
    si : softirq handlers
  Additionally, the following are displayed:
    > load average : 1-, 5- and 15-minute load averages
    > system uptime info
    > total process counts (total, running, sleeping, stopped, zombie)
  syntax: top [options]
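The us/sy/ni/id/wa/hi/si split shown by top can be reproduced from two samples of the aggregate `cpu` line in /proc/stat. The sketch below assumes the usual field order of that line (user, nice, system, idle, iowait, irq, softirq) and ignores any later fields such as steal; the function names are illustrative, not taken from top itself.

```python
# Order of the first seven counters on the aggregate "cpu" line of /proc/stat,
# labelled with the abbreviations top uses.
FIELDS = ("us", "ni", "sy", "id", "wa", "hi", "si")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of tick counts."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def cpu_percentages(before, after):
    """Percentage of elapsed ticks spent in each state between two samples."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total == 0:
        return {name: 0.0 for name in FIELDS}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}
```

Reading the first line of /proc/stat twice, a second or so apart, and passing both parsed samples to `cpu_percentages` gives figures comparable to one refresh of top.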
20 CPU load: vmstat
vmstat
  Default output is averages
  Stats relevant to CPU:
    in : interrupts
    cs : context switches
    us : total CPU in user (including "nice")
    sy : total CPU in system (including irq and softirq)
    wa : total CPU waiting
    id : total CPU idle
  syntax: vmstat [options] [delay [samples]]
  remember: not all tools go by the same definitions

21 CPU: /proc
/proc
  /proc knows everything
    /proc/interrupts
    /proc/ioports
    /proc/iomem
    /proc/cpuinfo
procinfo
  parses /proc for us
  syntax: procinfo [options]
mpstat
  useful to sort the data on multicore and multiprocessor systems
  syntax: mpstat [options]
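Since /proc files are plain text, a few lines of scripting are often enough to pull out what you need. The following sketch splits /proc/cpuinfo text into one record per logical CPU. It assumes the common layout of blank-line-separated blocks that each contain a `processor` key; other architectures may format the file differently, and `parse_cpuinfo` is a hypothetical helper name.

```python
def parse_cpuinfo(text):
    """Split /proc/cpuinfo text into one dict per logical CPU."""
    cpus = []
    current = {}

    def flush():
        # Keep only blocks that actually describe a processor.
        if "processor" in current:
            cpus.append(dict(current))
        current.clear()

    for line in text.splitlines():
        if not line.strip():
            flush()
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    flush()
    return cpus
```

Passing the contents of /proc/cpuinfo to `parse_cpuinfo` and taking `len()` of the result gives the logical CPU count on systems that follow this layout.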
22 CPU: others
sar
  System activity reporter
  still low overhead
oprofile
  uses hardware counters in modern CPUs
  high level of detail
    cache misses
    branch mispredictions
  a pain to set up (kernel module, daemon and processing tools)

23 RAM: free
free
  reports state of RAM, swap, buffer, caches
  'shared' value is obsolete, ignore it
/proc
  /proc knows everything
    > /proc/meminfo
    > /proc/slabinfo
slabtop
  kernel slab cache monitor
  syntax: slabtop [options]
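Because the kernel uses otherwise free memory for buffers and cache, the raw "free" figure understates what is available. A rough sketch of accounting for that from /proc/meminfo text follows. The helper names are illustrative, and summing `MemFree`, `Buffers` and `Cached` is a simplification (not every cached page can be reclaimed), so treat the result as an estimate.

```python
def parse_meminfo(text):
    """Return {field: number} for lines of the form 'Name:   1234 kB'.

    Most values are in kB; a few fields (e.g. huge page counts) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def reclaimable_kb(info):
    """Free memory plus buffers and page cache, as a rough availability estimate."""
    return info.get("MemFree", 0) + info.get("Buffers", 0) + info.get("Cached", 0)
```

Feeding the contents of /proc/meminfo to `parse_meminfo` and then `reclaimable_kb` gives a number to compare against the "free" column before concluding that a system is short of RAM.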
24 RAM: others
vmstat, top, procinfo, sar
  some information on RAM here as well

25 I/O: vmstat
vmstat
  I/O subsystem statistics
    > -D : all
    > -d : individual disk
    > -p : partition
  output
    > bo : blocks written (in prev interval)
    > bi : blocks read (in prev interval)
    > wa : CPU time spent waiting for I/O
    > (IO: cur) : total number of I/O ops currently in progress
    > (IO: s) : number of seconds spent waiting for I/O to complete
  syntax: vmstat [options] [delay [samples]]

26 I/O: iostat
iostat
  I/O subsystem statistics, but better than vmstat
    > -d : disk only, no CPU
    > -k : KB rather than blocks as units
    > -x : extended I/O stats
  output
    > tps : transfers per second
    > Blk_read/s : disk blocks read per second
    > Blk_wrtn/s : disk blocks written per second
    > Blk_read : total blocks read during the delay interval
    > Blk_wrtn : total blocks written during the delay interval
    > rrqm/s : the number of reads merged before dispatch to disk
    > wrqm/s : the number of writes merged before dispatch to disk
    > r/s : reads issued to the disk per second
    > w/s : writes issued to the disk per second
    > rsec/s : disk sectors read per second

27 I/O: iostat (continued)
iostat
  output (continued)
    > wsec/s : disk sectors written per second
    > rkb/s : KB read from disk per second
    > wkb/s : KB written to disk per second
    > avgrq-sz : average size of requests, in sectors
    > avgqu-sz : average size of disk request queue
    > await : the average time for a request to be completed (ms)
    > svctm : average service time (await includes service and queue time)
  syntax: iostat [options] [delay [samples]]
  tip: to look at swapping, look at iostat output for the relevant disk
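Since await includes both service and queue time, the queueing share can be estimated as await minus svctm. The sketch below derives both from two samples of a /proc/diskstats line, assuming the classic layout of eleven counters after the device name. It approximates how such figures can be computed and is not iostat's actual code; the function names are hypothetical.

```python
def parse_diskstats_line(line):
    """Pick the counters needed for await/svctm out of one /proc/diskstats line."""
    parts = line.split()
    nums = [int(v) for v in parts[3:14]]  # the eleven counters after the name
    return {
        "name": parts[2],
        "reads": nums[0],      # reads completed
        "read_ms": nums[3],    # milliseconds spent reading
        "writes": nums[4],     # writes completed
        "write_ms": nums[7],   # milliseconds spent writing
        "busy_ms": nums[9],    # milliseconds the device was doing I/O
    }


def interval_stats(before, after):
    """Average await, service and queue time (ms) per request over an interval."""
    ios = (after["reads"] - before["reads"]) + (after["writes"] - before["writes"])
    if ios == 0:
        return {"await_ms": 0.0, "svctm_ms": 0.0, "queue_ms": 0.0}
    wait_ms = ((after["read_ms"] - before["read_ms"])
               + (after["write_ms"] - before["write_ms"]))
    await_ms = wait_ms / ios
    svctm_ms = (after["busy_ms"] - before["busy_ms"]) / ios
    return {"await_ms": await_ms, "svctm_ms": svctm_ms,
            "queue_ms": await_ms - svctm_ms}
```

A queue time that dominates await points at too many outstanding requests rather than a slow device.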
28 I/O: others
sar
  can also pull I/O duty
hdparm
  great way to corrupt existing partitions
  also, great way to make sure your disks are being used to their full potential
bonnie
  non-destructive disk benching
  caching and buffering impact results in a major way
  can be worked around by using sufficiently large data sizes

29 Network I/O
iptraf
  there are several other tools, but iptraf is considerably ahead
  somewhat GUI driven (in curses)
  syntax: iptraf [options]

30 III - Tracking a specific program

31 The Most Basic Tool: time
time
  time measures the runtime of a program, startup to exit
  Three figures provided
    real
    user
    sys
  Only user is system-state independent, really
  syntax: time <program>
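The three figures reported by time can also be collected from a script, which helps when logging many runs. This is a minimal sketch for Unix-like systems using the standard `resource` and `subprocess` modules; `time_command` is a hypothetical helper, and the child accounting covers every child reaped during the call, so it assumes nothing else is being waited on concurrently.

```python
import resource
import subprocess
import sys
import time


def time_command(argv):
    """Run argv to completion and return (real, user, sys) in seconds."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(argv, check=False)
    real = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return (real,
            after.ru_utime - before.ru_utime,   # CPU time in user mode
            after.ru_stime - before.ru_stime)   # CPU time in the kernel
```

For example, `time_command([sys.executable, "-c", "sum(range(10**6))"])` returns a tuple in which real is typically larger than user plus sys; the gap is time spent waiting rather than computing.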
32 CPU load: top - II
Per-process stats
  PID : pid
  PR : priority
  NI : niceness
  S : process status (S, R, Z, D, T)
  WCHAN : which I/O op blocked on (if any), aka sleeping function
  TIME : total (system + user) CPU time spent since startup
  COMMAND : command that started the process
  #C : last CPU seen executing this process
  FLAGS : task flags (sched.h)
  USER : username of running process
  VIRT : virtual image size
  RES : resident size
  SHR : shared mem size
  ...

33 CPU load: top - III
Per-process stats (continued)
  %CPU : CPU usage
  %MEM : memory usage
  TIME+ : CPU time, to the hundredths
  PPID : PID of parent process
  RUSER : real username of running process
  UID : user ID
  GROUP : group name
  TTY : controlling TTY
  SWAP : swapped size
  CODE : code size
  DATA : (data + stack) size
  nflt : page fault count
  ndrt : dirty page count

34 System calls: strace
strace
  a few selected options:
    > -c : profile
    > -f, -F : follow forks, vforks
    > -e : qualify with expression
    > -p : trace PID
    > -S : sort by (time, calls, name, nothing); default=time
    > -E : add or remove from ENV
    > -v : verbose
  syntax: strace [options]
  No instrumentation used on binary, kernel trickery on syscalls

35 Library calls: ltrace
ltrace
  a few selected options:
    > -c : profile
    > -o : save output to <file>
    > -p : trace PID
    > -S : follow syscalls too, like strace
  syntax: ltrace [options]
  the -c option is a favorite quick&dirty profiling trick

36 Library calls: runtime loader
ld.so
  dynamic linker can register information about its execution
  environment controlled
    libs : display library search paths
    reloc : display relocation processing
    files : display progress for input file
    symbols : display symbol table processing
    bindings : display information about symbol binding
    versions : display version dependencies
    all : all previous options combined
    statistics : display relocation statistics
    unused : determine unused DSOs
  example: env LD_DEBUG=statistics LD_DEBUG_OUTPUT=outfile kcalc

37 Profiling: gprof and gcov
gprof
  the GNU profiler
    > requires instrumented binaries (compile time)
    > gprof used to parse output file
gcov
  coverage tool
  uses the same instrumentation
  similar process

38 Process RAM
/proc
  /proc/<pid>/status
    > VmSize : amount of virtual mem the process is (currently) using
    > VmLck : amount of locked memory
    > VmRSS : amount of physical memory currently in use
    > VmData : data size (virtual), excluding stack
    > VmStk : size of the process's stack
    > VmExe : executable memory (virtual), libs excluded
    > VmLib : size of libraries in use
  /proc/<pid>/maps
    use of virtual address space
memprof
  graphical tool for the same data
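The Vm* fields listed above are easy to collect programmatically for logging over time. A minimal sketch, assuming the usual `Name:<tab>value kB` layout of /proc/<pid>/status; `parse_vm_fields` is an illustrative name.

```python
def parse_vm_fields(status_text):
    """Extract the Vm* lines of /proc/<pid>/status as {name: size_in_kB}."""
    fields = {}
    for line in status_text.splitlines():
        name, sep, rest = line.partition(":")
        values = rest.split()
        if sep and name.startswith("Vm") and values:
            fields[name] = int(values[0])
    return fields
```

Sampling /proc/<pid>/status periodically and comparing `VmRSS` against `VmSize` shows how much of the address space a process is actually touching, and a steadily growing `VmData` is a quick hint of a leak.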
39 Valgrind
valgrind
  emulates processor to application, providing:
    > really slow execution!
  memcheck
    > uninitialized memory
    > leaks checking
  addrcheck
    faster, but fewer errors caught
  massif
    heap profiler
  helgrind
    race conditions
  cachegrind / kcachegrind
    cache profiler

40 Interprocess communication
ipcs
  information on IPC primitives, memory in use
    > -t : time of creation
    > -u : being used or swapped
    > -l : system wide limits for use
    > -p : PIDs of creating/last user process
    > -a : all information

41 Tips & Tricks
A few odd ones out:
  /bin/swapoff to test (RAM permitting) w/o swap
  ps is still king when it comes to processes
    look at the custom output formats for convenient scripting
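As an example of scripting against ps custom output, the sketch below parses the output of `ps -eo pid=,rss=,comm=` (the trailing `=` suppresses the header line) and ranks processes by resident size. Treating the rss column as kilobytes follows the common Linux ps behaviour and is an assumption here; `parse_ps` and `top_by_rss` are hypothetical helper names.

```python
def parse_ps(output):
    """Parse 'pid rss comm' rows; comm may itself contain spaces."""
    rows = []
    for line in output.splitlines():
        parts = line.split(None, 2)
        if len(parts) == 3:
            rows.append({"pid": int(parts[0]),
                         "rss_kb": int(parts[1]),
                         "comm": parts[2]})
    return rows


def top_by_rss(rows, count=5):
    """Return the `count` rows with the largest resident set size."""
    return sorted(rows, key=lambda row: row["rss_kb"], reverse=True)[:count]
```

The text to parse can be captured with `subprocess.run(["ps", "-eo", "pid=,rss=,comm="], capture_output=True, text=True).stdout` on a system where ps is installed.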
42 IV - VM test Dummies

43 VM Details
Garbage collection
  you need to know the algorithms at play
    floating garbage
    stop-the-world
    compacting
    ...
  considerable differences between Sun JVM and others (not to mention Mono!)
    different VMs have different profiling APIs
    different GC algorithms or implementations
  yes, you need to skim the relevant papers
Profiling APIs
  JVMTI (current)
  JVMPI (since 1.1): legacy, flaky on HotSpot, deprecated in 1.5
  JVMDI (since 1.1): legacy, also targeted at removal for

44 VM Tools (my favorites)
JProfiler (commercial)
  extremely detailed info
  disables minor garbage collections
  gathers so much data it can hang itself
  best tool out there in terms of breadth, and first choice to analyze a leak
  deadlock-finding tools
  easier to break, as it is doing so much more
NetBeans profiler
  positively great, hats off to Sun, started using it in beta
  much less information than JProfiler, but can redefine instrumentation at runtime (!)
  telemetry has zero footprint, and can be used to identify the moment for a snapshot
  generation information provides great hints for really small leaks (bad signal/noise)
  extremely stable
Roll-your-own

45 V - Conclusion

46 What more to say?
There are a lot of tools out there, start picking them up and go get your hands dirty!

47 Resources
Books
  Code Optimization: Effective Memory Usage (Kris Kaspersky)
  The Software Optimization Cookbook (Richard Gerber)
  Optimizing Linux Performance (Philip Ezolt)
  Linux Debugging and Performance Tuning (Steve Best)
  Linux Performance Tuning and Capacity Planning (Jason Fink and Matthew Sherer)
  Self Service Linux (Mark Wilding and Dan Behman)
  Graphics Programming Black Book (Michael Abrash)
Other
  Kernel Optimization / Tuning (Gerald Pfeifer - Brainshare)

48 Any Questions?

49 Thanks for coming! Princeton, February
Get the Better of Memory Leaks with Valgrind Whitepaper
WHITE PAPER Get the Better of Memory Leaks with Valgrind Whitepaper Memory leaks can cause problems and bugs in software which can be hard to detect. In this article we will discuss techniques and tools
Agenda. Enterprise Application Performance Factors. Current form of Enterprise Applications. Factors to Application Performance.
Agenda Enterprise Performance Factors Overall Enterprise Performance Factors Best Practice for generic Enterprise Best Practice for 3-tiers Enterprise Hardware Load Balancer Basic Unix Tuning Performance
20 Command Line Tools to Monitor Linux Performance
20 Command Line Tools to Monitor Linux Performance 20 Command Line Tools to Monitor Linux Performance It s really very tough job for every System or Network administrator to monitor and debug Linux System
KVM: A Hypervisor for All Seasons. Avi Kivity [email protected]
KVM: A Hypervisor for All Seasons Avi Kivity [email protected] November 2007 Virtualization Simulation of computer system in software Components Processor: register state, instructions, exceptions Memory
PERFORMANCE TUNING FOR PEOPLESOFT APPLICATIONS
PERFORMANCE TUNING FOR PEOPLESOFT APPLICATIONS 1.Introduction: It is a widely known fact that 80% of performance problems are a direct result of the to poor performance, such as server configuration, resource
Instrumentation Software Profiling
Instrumentation Software Profiling Software Profiling Instrumentation of a program so that data related to runtime performance (e.g execution time, memory usage) is gathered for one or more pieces of the
Virtualization. Explain how today s virtualization movement is actually a reinvention
Virtualization Learning Objectives Explain how today s virtualization movement is actually a reinvention of the past. Explain how virtualization works. Discuss the technical challenges to virtualization.
Why Relative Share Does Not Work
Why Relative Share Does Not Work Introduction Velocity Software, Inc March 2010 Rob van der Heij rvdheij @ velocitysoftware.com Installations that run their production and development Linux servers on
EE361: Digital Computer Organization Course Syllabus
EE361: Digital Computer Organization Course Syllabus Dr. Mohammad H. Awedh Spring 2014 Course Objectives Simply, a computer is a set of components (Processor, Memory and Storage, Input/Output Devices)
CS 147: Computer Systems Performance Analysis
CS 147: Computer Systems Performance Analysis CS 147: Computer Systems Performance Analysis 1 / 39 Overview Overview Overview What is a Workload? Instruction Workloads Synthetic Workloads Exercisers and
COS 318: Operating Systems
COS 318: Operating Systems File Performance and Reliability Andy Bavier Computer Science Department Princeton University http://www.cs.princeton.edu/courses/archive/fall10/cos318/ Topics File buffer cache
Operating System Structures
COP 4610: Introduction to Operating Systems (Spring 2015) Operating System Structures Zhi Wang Florida State University Content Operating system services User interface System calls System programs Operating
Release 2.3.4 - February 2005
Release 2.3.4 - February 2005 Linux System and Performance Monitoring Darren Hoch Director of Professional Services StrongMail Systems, Inc. Linux Performance Monitoring PUBLISHED BY: Darren Hoch StrongMail
Oak Ridge National Laboratory Computing and Computational Sciences Directorate. Lustre Crash Dumps And Log Files
Oak Ridge National Laboratory Computing and Computational Sciences Directorate Lustre Crash Dumps And Log Files Jesse Hanley Rick Mohr Sarp Oral Michael Brim Nathan Grodowitz Gregory Koenig Jason Hill
Technical Properties. Mobile Operating Systems. Overview Concepts of Mobile. Functions Processes. Lecture 11. Memory Management.
Overview Concepts of Mobile Operating Systems Lecture 11 Concepts of Mobile Operating Systems Mobile Business I (WS 2007/08) Prof Dr Kai Rannenberg Chair of Mobile Business and Multilateral Security Johann
Lecture 10: Dynamic Memory Allocation 1: Into the jaws of malloc()
CS61: Systems Programming and Machine Organization Harvard University, Fall 2009 Lecture 10: Dynamic Memory Allocation 1: Into the jaws of malloc() Prof. Matt Welsh October 6, 2009 Topics for today Dynamic
Validating Java for Safety-Critical Applications
Validating Java for Safety-Critical Applications Jean-Marie Dautelle * Raytheon Company, Marlborough, MA, 01752 With the real-time extensions, Java can now be used for safety critical systems. It is therefore
ò Paper reading assigned for next Thursday ò Lab 2 due next Friday ò What is cooperative multitasking? ò What is preemptive multitasking?
Housekeeping Paper reading assigned for next Thursday Scheduling Lab 2 due next Friday Don Porter CSE 506 Lecture goals Undergrad review Understand low-level building blocks of a scheduler Understand competing
Holly Cummins IBM Hursley Labs. Java performance not so scary after all
Holly Cummins IBM Hursley Labs Java performance not so scary after all So... You have a performance problem. What next? Goals After this talk you will: Not feel abject terror when confronted with a performance
Practical Performance Understanding the Performance of Your Application
Neil Masson IBM Java Service Technical Lead 25 th September 2012 Practical Performance Understanding the Performance of Your Application 1 WebSphere User Group: Practical Performance Understand the Performance
Effective Java Programming. efficient software development
Effective Java Programming efficient software development Structure efficient software development what is efficiency? development process profiling during development what determines the performance of
Extreme Performance with Java
Extreme Performance with Java QCon NYC - June 2012 Charlie Hunt Architect, Performance Engineering Salesforce.com sfdc_ppt_corp_template_01_01_2012.ppt In a Nutshell What you need to know about a modern
Lecture 17: Virtual Memory II. Goals of virtual memory
Lecture 17: Virtual Memory II Last Lecture: Introduction to virtual memory Today Review and continue virtual memory discussion Lecture 17 1 Goals of virtual memory Make it appear as if each process has:
The System Monitor Handbook. Chris Schlaeger John Tapsell Chris Schlaeger Tobias Koenig
Chris Schlaeger John Tapsell Chris Schlaeger Tobias Koenig 2 Contents 1 Introduction 6 2 Using System Monitor 7 2.1 Getting started........................................ 7 2.2 Process Table.........................................
Multi-core architectures. Jernej Barbic 15-213, Spring 2007 May 3, 2007
Multi-core architectures Jernej Barbic 15-213, Spring 2007 May 3, 2007 1 Single-core computer 2 Single-core CPU chip the single core 3 Multi-core architectures This lecture is about a new trend in computer
Best Practices for Monitoring Databases on VMware. Dean Richards Senior DBA, Confio Software
Best Practices for Monitoring Databases on VMware Dean Richards Senior DBA, Confio Software 1 Who Am I? 20+ Years in Oracle & SQL Server DBA and Developer Worked for Oracle Consulting Specialize in Performance
Squeezing The Most Performance from your VMware-based SQL Server
Squeezing The Most Performance from your VMware-based SQL Server PASS Virtualization Virtual Chapter February 13, 2013 David Klee Solutions Architect (@kleegeek) About HoB Founded in 1998 Partner-Focused
