Page replacement in Linux 2.4 memory management
Rik van Riel, Conectiva Inc.

Abstract

While the virtual memory management in Linux 2.2 has decent performance for many workloads, it suffers from a number of problems. The first part of this paper describes how the Linux 2.2 VMM works and analyses why it behaves badly in some situations. The second part describes how much of this behaviour has been fixed in the Linux 2.4 kernel. Because Linux 2.4 was in a code freeze period while these improvements were implemented, only known-good solutions have been integrated. Many of the ideas used are derived from principles used in other operating systems, mostly because we have certainty that they work and a good understanding of why, making them suitable for integration into the Linux codebase during a code freeze.

1 Linux 2.2 memory management

The memory management in the Linux 2.2 kernel seems to be focused on simplicity and low overhead. While this works pretty well in practice for most systems, it has some weak points left and simply falls apart under some scenarios.

Memory in Linux is unified: all physical memory is on the same free list and can be allocated to any of the following memory pools on demand. Most of these pools can grow and shrink on demand. Typically, most of a system's memory will be allocated to the data pages of processes and to the page and buffer caches.

- The slab cache: the kernel's dynamically allocated heap storage. This memory is unswappable, but once all objects within one (usually page-sized) area are unused, that area can be reclaimed.
- The page cache: used to cache file data for both mmap() and read(), indexed by (inode, index) pairs. No dirty data exists in this cache; whenever a program writes to a page, the dirty data is copied to the buffer cache, from where the data is written back to disk.
- The buffer cache: indexed by (block device, block number) tuples and used to cache raw disk devices, inodes, directories and other filesystem metadata. It is also used to perform disk IO on behalf of the page cache and the other caches. For disk reads the page cache bypasses this cache, and for network filesystems it isn't used at all.
- The inode cache: resides in the slab cache and contains information about cached files in the system. Linux 2.2 cannot shrink this cache, but because of its limited size it does need to reclaim individual entries.
- The dentry cache: contains directory and name information in a filesystem-independent way and is used to look up files and directories. This cache is dynamically grown and shrunk on demand.
- SYSV shared memory: the pool containing the SYSV shared memory segments is managed much like the page cache, but has its own infrastructure for doing things.
- Process mapped virtual memory: administrated in the process page tables. Processes can have page cache or SYSV shared memory segments mapped, in which case those pages are managed both in the page tables and in the data structures of the page cache or the shared memory code, respectively.
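To make the two cache indexing schemes concrete, here is a small C sketch of the lookup keys. These struct definitions are illustrative mock-ups, not the 2.2 kernel's actual declarations; the real kernel hashes these same fields into global page and buffer hash tables.

```c
/* Illustrative shapes of the two cache lookup keys in Linux 2.2. */
typedef unsigned short kdev_t;   /* device number, as in 2.2 */
struct inode;                    /* opaque for this sketch */

struct page_cache_key {          /* page cache: (inode, index) */
    struct inode *inode;         /* which file */
    unsigned long index;         /* page-sized offset within the file */
};

struct buffer_cache_key {        /* buffer cache: (device, block) */
    kdev_t dev;                  /* which block device */
    unsigned long block;         /* block number on that device */
};
```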
1.1 Linux 2.2 page replacement

The page replacement of Linux 2.2 works as follows. When free memory drops below a certain threshold, the pageout daemon (kswapd) is woken up. The pageout daemon should usually be able to keep enough memory free, but if it isn't, user programs end up calling the pageout code themselves. The main pageout loop is in the function try_to_free_pages(), which starts by freeing unused slabs from the kernel memory pool. After that, it calls the following functions in a loop, asking each of them to scan a small part of its share of memory until enough memory has been freed.

- shrink_mmap() is a classical clock algorithm, which loops over all physical pages, clearing referenced bits, queueing old dirty pages for IO and freeing old clean pages. Its main disadvantage compared to a true clock algorithm, however, is that it isn't able to free pages which are in use by a program or a shared memory segment; those pages need to be unmapped by swap_out() first.
- shm_swap() scans the SYSV shared memory segments, swapping out those pages that haven't been referenced recently and aren't mapped into any process.
- swap_out() scans the virtual memory of all processes in the system, unmapping pages which haven't been referenced recently, starting swapout IO and placing those pages in the page cache.
- shrink_dcache_memory() reclaims entries from the VFS name cache. This is not directly reusable memory, but as soon as a whole page of these entries becomes unused we can reclaim that page.

Some balancing between these memory-freeing functions is achieved by calling them in a loop, starting off by asking each of them to scan a little bit of its memory; each of these functions accepts a priority argument which tells it how big a percentage of its memory to scan. If not enough memory is freed in the first loop, the priority is increased and the functions are called again. The idea behind this scheme is that when one memory pool is heavily used, it will not give up its resources lightly and we'll automatically fall through to one of the other memory pools. However, the scheme relies on each of the memory pools reacting in a similar way to the priority argument under different load conditions, which doesn't work out in practice because the pools have fundamentally different properties to begin with.
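The shape of this loop is easy to show in code. The sketch below is a simplified user-space rendition of the 2.2 balancing loop, not the kernel source; the scanner bodies, the priority range and the free target are assumptions made for illustration.

```c
#include <stdbool.h>

/* Stand-ins for the real 2.2 scanners; each returns the number of
 * pages freed at the given priority (lower = scan a larger share). */
static int shrink_mmap(int priority)          { return 6 - priority; }
static int shm_swap(int priority)             { return 0; }
static int swap_out(int priority)             { return 6 - priority; }
static int shrink_dcache_memory(int priority) { return 0; }

#define FREE_TARGET 32   /* pages to free per invocation (assumed value) */

/* Simplified shape of Linux 2.2's try_to_free_pages(): ask every
 * memory pool to scan a little, retry at higher pressure if needed. */
static bool try_to_free_pages_sketch(void)
{
    int freed = 0;

    for (int priority = 6; priority >= 0; priority--) {
        freed += shrink_mmap(priority);
        freed += shm_swap(priority);
        freed += swap_out(priority);
        freed += shrink_dcache_memory(priority);
        if (freed >= FREE_TARGET)
            return true;   /* enough memory reclaimed */
    }
    return false;          /* the pools would not give up enough pages */
}
```

Note how the scheme presumes every pool responds comparably to the same priority value; as the text explains, that assumption is exactly what fails in practice.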
1.2 Problems with the Linux 2.2 page replacement

- Balancing between evicting pages from the file cache, evicting unused process pages and evicting pages from shm segments is fragile. If memory pressure is just right, shrink_mmap() is always successful in freeing cache pages, and a process which has been idle for a day stays in memory. This can even happen on a system with a fairly busy filesystem cache, but only with the right phase of the moon.
- Simple NRU [Note] replacement cannot accurately identify the working set versus incidentally accessed pages and can lead to extra page faults. This doesn't hurt noticeably for most workloads, but it makes a big difference in some workloads and can be fixed easily, mostly since the LFU replacement used in older Linux kernels is known to work.
- Due to the simple clock algorithm in shrink_mmap(), clean, accessed pages can sometimes get evicted before old, dirty pages. With a relatively small file cache that mostly consists of dirty data, e.g. while unpacking a tarball, it is possible for the dirty pages to evict the (clean) metadata buffers that are needed to write the dirty data to disk. A few other corner cases with amusing variations on this theme are bound to exist.
- The system reacts badly to variable VM load or to load spikes after a period of no VM activity. Since kswapd, the pageout daemon, only scans when the system is low on memory, the system can end up in a state where some pages have referenced bits from the last 5 seconds while other pages have referenced bits from 20 minutes ago. On a load spike the system then has no clue which are the right pages to evict from memory; this can lead to a swapping storm, where the wrong pages are evicted and almost immediately afterwards faulted back in, leading to the pageout of another random page, and so on.
- Under very heavy loads, NRU replacement of pages simply doesn't cut it. More careful and better balanced page eviction and flushing are called for, but with the fragility of the Linux 2.2 pageout framework this goal doesn't really seem achievable. The fact that shrink_mmap() is a simple clock algorithm and relies on other functions to make process-mapped pages freeable makes it fairly unpredictable. Add to that the balancing loop in try_to_free_pages() and you get a VM subsystem which is extremely sensitive to minute changes in the code and a fragile beast at best when it comes to maintenance or (shudder) tweaking.

2 Changes in Linux 2.4

For Linux 2.4 a substantial development effort went into things like making the VM subsystem fully fine-grained for SMP systems and supporting machines with more than 1GB of RAM. Changes to the pageout code were made only in the last phase of development and are, because of that, somewhat conservative in nature, employing only known-good methods to deal with the problems of page replacement in the Linux 2.2 kernel. Before we get to the page replacement changes, however, here is a short overview of the other changes in the 2.4 VM:

- More fine-grained SMP locking. The scalability of the VM subsystem has improved a lot for workloads where multiple CPUs are reading or writing the same file simultaneously, for example web or ftp server workloads. This has no real influence on the page replacement code.
- Unification of the buffer cache and the page cache. While in Linux 2.2 the page cache used the buffer cache to write back its data, needing an extra copy of the data and doubling memory requirements for some write loads, in Linux 2.4 dirty page cache pages are simply added to both the buffer and the page cache, and the system does disk IO directly to and from the page cache page. The buffer cache is still maintained separately for filesystem metadata and the caching of raw block devices. Note that the cache was already unified for reads in Linux 2.2; Linux 2.4 merely completes the unification.
- Support for systems with up to 64GB of RAM (on x86). The Linux kernel previously had all physical memory directly mapped in the kernel's virtual address space, which limited the amount of supported memory to slightly under 1GB. For Linux 2.4 the kernel also supports additional memory (so-called "high memory" or highmem), which cannot be used for kernel data structures but only for page cache and user process memory. To do IO on these pages they are temporarily mapped into kernel virtual memory, with the data copied to or from a bounce buffer in low memory. At the same time, the memory zone for ISA DMA (the 0-16MB physical address range) has been split out into a separate page zone. This means larger x86 systems end up with three memory zones, all of which need their free memory balanced so the kernel can continue allocating kernel data structures and ISA DMA buffers. The memory zone logic is generalised enough to also work for NUMA systems.
- The SYSV shared memory code has been removed and replaced with a simple memory filesystem which uses the page cache for all its functions. It supports both POSIX SHM and SYSV SHM semantics and can also be used as a swappable memory filesystem (tmpfs).
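As a concrete picture of the three-zone layout on a large x86 machine, here is a user-space sketch. The zone names mirror the kernel's; the sizes, free counts and watermark values are invented for the example, and the balancing check is a simplification of the real per-zone logic.

```c
#include <stdio.h>

/* Illustrative model of the 2.4 x86 memory zones. */
enum zone_type { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, NR_ZONES };

struct zone {
    const char *name;
    unsigned long pages;     /* size of the zone in pages (example) */
    unsigned long free;      /* currently free pages (example) */
    unsigned long pages_min; /* per-zone free target (example) */
};

int main(void)
{
    struct zone zones[NR_ZONES] = {
        /* 0-16MB: ISA DMA-capable memory, kept in its own zone */
        { "DMA",       4096,   512,  128 },
        /* direct-mapped kernel memory, up to slightly under 1GB */
        { "Normal",  225280, 10240, 1024 },
        /* highmem: page cache and user pages only, no kernel data */
        { "HighMem", 786432, 65536, 2048 },
    };

    /* Each zone's free memory is balanced separately, so kernel
     * allocations (Normal) and ISA DMA buffers (DMA) never starve
     * just because HighMem still has plenty of free pages. */
    for (int i = 0; i < NR_ZONES; i++)
        printf("%-8s free=%6lu min=%5lu %s\n",
               zones[i].name, zones[i].free, zones[i].pages_min,
               zones[i].free < zones[i].pages_min ? "NEEDS RECLAIM" : "ok");
    return 0;
}
```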
Since the changes to the page replacement code took place after all these changes, during the (one-and-a-half-year-long) code freeze of the Linux 2.4 kernel, the changes have been kept fairly conservative. On the other hand, we have tried to fix as many of the Linux 2.2 page replacement problems as possible. Here is a short overview of the page replacement changes; they'll be described in more detail below.

- Page aging, which was present in the Linux 1.2 and 2.0 kernels and in FreeBSD, has been reintroduced into the VM. However, a few small changes have been made to avoid some artifacts of virtual-page-based aging.
- To avoid evicting the wrong pages due to interactions between page aging and page flushing, aging and flushing have been separated; there are now active and inactive page lists.
- Page flushing has been optimised to avoid too much interference by writeout IO with the more time-critical disk read IO.
- Controlled background page aging is performed during periods of little or no VM activity, in order to keep the system in a state where it can easily deal with load spikes.
- Streaming IO is detected; we do early eviction on the pages that have already been used and reward the IO stream with more aggressive readahead.

3 Linux 2.4 page replacement changes in detail

The development of the page replacement changes in Linux 2.4 has been influenced by two main factors. First, the bad behaviours of Linux 2.2 page replacement had to be fixed using only known-good strategies, because the development of Linux 2.4 had already entered the code freeze. Second, the page replacement had to be more predictable and easier to understand than in Linux 2.2, because tuning the page replacement in Linux 2.2 deserved the proverbial label "subtle and quick to upset". This means that only VM ideas that are well understood and have few interactions with the rest of the system were integrated. Many ideas were taken from other freely available operating systems and from the literature.

3.1 Page aging

Page aging was the first easy step in making the bad border-case behaviour of Linux 2.2 go away; it works reasonably well in Linux 1.2, Linux 2.0 and FreeBSD. Page aging allows us to make a much finer distinction between pages we want to keep in memory and pages we want to swap out than the NRU aging in Linux 2.2 does. Page aging in these OSes works as follows: for each physical page we keep a counter (called age in Linux, or act_count in FreeBSD) that indicates how desirable it is to keep the page in memory. When scanning through memory for pages to evict, we increase the page age (adding a constant) whenever we find that the page was accessed, and we decrease the page age (subtracting a constant) whenever we find that it wasn't. When the page age (or act_count) reaches zero, the page is a candidate for eviction.

However, in some situations the LFU [Note] page aging of Linux 2.0 is known to have too much CPU overhead and to adjust too slowly to changes in system load. Furthermore, research [Smaragdis, Kaplan, Wilson] has shown that recency of access is a more important criterion for page replacement than frequency. Both problems are solved by applying exponential decline to the page age (dividing by two instead of subtracting a constant) whenever we find a page that wasn't accessed, resulting in page replacement that is closer to LRU [Note] than to LFU. This reduces the CPU overhead of page aging drastically in some cases; no noticeable change in swap behaviour has been observed, however.

Another artifact comes from the virtual address scanning. In Linux 1.2 and 2.0 the system reduces the page age of a page whenever it sees that the page hasn't been accessed from the page table it is currently scanning, completely ignoring the fact that the page could have been accessed through other page tables. This can put a severe penalty on heavily shared pages, for example the C library. The problem is fixed by simply not doing downwards aging from the virtual page scans, but only from the physical-page-based scanning of the active list.
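The aging arithmetic above can be captured in a few lines. The following is a minimal sketch, with assumed constants rather than the 2.4 kernel's exact values:

```c
/* A minimal model of 2.4-style page aging; constants are assumptions. */
#define PAGE_AGE_START 2    /* age of a freshly faulted-in page */
#define PAGE_AGE_ADV   3    /* bonus for a referenced page */
#define PAGE_AGE_MAX  64    /* cap, so hot pages can still decay away */

struct page_sketch {
    int age;
    int referenced;         /* referenced bit, set on access by the MMU */
};

/* Called from the pageout scan for every page it passes over. */
static void age_page(struct page_sketch *page)
{
    if (page->referenced) {
        page->referenced = 0;          /* clear for the next scan */
        page->age += PAGE_AGE_ADV;     /* linear increase when used */
        if (page->age > PAGE_AGE_MAX)
            page->age = PAGE_AGE_MAX;
    } else {
        page->age /= 2;                /* exponential decline when idle:
                                          closer to LRU than to LFU */
    }
    /* When page->age reaches 0 the page becomes an eviction candidate
     * and is moved to the inactive dirty list (section 3.2). */
}
```

The divide-by-two on idle pages is what keeps recency dominant: a page that was hot last week but untouched since loses its accumulated age within a few scans, where subtracting a constant would keep it resident far longer.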
If we encounter pages which are not referenced and are present in the page tables but not on the active list, we simply follow the swapout path to add the page to the swap cache and the active list, so we'll be able to lower its page age and swap it out as soon as that age reaches zero.

3.2 Multiple page lists

The bad interaction between page aging and page flushing, where referenced clean pages were freed before old dirty pages, is fixed by keeping the pages which are candidates for eviction (page age zero) separated from the pages we want to keep in memory (page age nonzero). We separate the pages by putting them on different page lists and having a separate algorithm deal with each list.

Pages which are not (yet) candidates for eviction live in process page tables, on the active list, or both. Page aging as described above happens on these pages, with the function refill_inactive() balancing between scanning the page tables and scanning the active list.

When the age of a page reaches zero, due to a combination of pageout scanning and the page not being actively used, the page is moved to the inactive dirty list. Pages on this list are not mapped in the page tables of any process and are, or can become, reclaimable. They are handled by the function page_launder(), which flushes the dirty pages to disk and moves the clean pages to the inactive clean list.

Unlike the active and inactive dirty lists, the inactive clean list isn't global but per memory zone. The pages on these lists can be immediately reused by the page allocation code and count as free pages. They can also still be faulted back in from wherever they came, since the data is still there. In BSD this list would be called the cache queue.

3.3 Dynamically sized inactive list

Since we do page aging to select which pages to evict, having a very large, statically sized inactive list (like FreeBSD has) doesn't seem to make much sense. In fact, it would cancel out some of the effects of doing the page aging in the first place: why spend much effort selecting which pages to evict [Dillon] when you keep as much as 33% of your swappable pages on the inactive list? Why do careful page aging when 33% of your pages end up as eviction candidates at the same priority, effectively undoing the aging for those pages?

On the other hand, having lots of inactive pages to choose from when doing page eviction means more chances of avoiding writeout IO or doing better IO clustering, and it gives more of a buffer to deal with allocations due to page faults, etc. Both a large and a small target size for the inactive page list have their benefits. In Linux 2.4 we have chosen a middle ground by letting the system dynamically vary the size of the inactive list depending on VM activity, with an artificial upper limit to make sure the system always preserves some aging information.

Linux 2.4 keeps a floating average of the number of pages evicted per second and sets the combined target for the inactive list and the free list to the free target plus this average number of page steals per second. Not only does this one second's worth of pages give us enough time to do all kinds of page flushing optimisations, it is also small enough to keep the page age distribution within the system intact, allowing us to make good choices about which pages to evict and which to keep.
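Expressed in code, the sizing rule is a one-line formula fed by a floating average. The sketch below is illustrative only; the moving-average decay factor and the variable names are assumptions, not the kernel's exact bookkeeping.

```c
/* Illustrative computation of the inactive list target. */
static unsigned long page_steal_avg;   /* floating average, pages/sec */
static unsigned long freepages_target; /* free list target, in pages */

/* Called once per second with the number of pages evicted since
 * the previous call. */
void update_eviction_rate(unsigned long steals_last_second)
{
    /* simple exponential moving average: 7/8 old + 1/8 new (assumed) */
    page_steal_avg = (7 * page_steal_avg + steals_last_second) / 8;
}

/* Combined target for the free list plus inactive pages: enough
 * reclaimable pages to survive roughly one second of page steals. */
unsigned long inactive_target(void)
{
    return freepages_target + page_steal_avg;
}
```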
3.4 Optimised page flushing

Writing out pages from the inactive dirty list as we encounter them can totally destroy read performance because of the extra disk seeks. A better solution is to delay the writeout of dirty pages and let them accumulate until we can do better IO clustering, so that these pages can be written to disk with fewer disk seeks and less interference with read performance. Because the page replacement changes were developed during the code freeze, the system currently has a rather simple implementation of what is present in FreeBSD 4.2.

As long as there are enough clean inactive pages around, we keep moving those to the inactive clean list and never bother with syncing out the dirty pages. Note that this catches both clean pages and pages which have already been written to disk by the update daemon (which commits filesystem data to disk periodically). This means that under loads where data is seldom written we can avoid writing out dirty inactive pages most of the time, giving us much better latencies in freeing pages and letting streaming reads continue without the disk head moving away to write out data all the time. Only under loads where lots of pages are being dirtied quickly does the system suffer a bit from syncing out dirty data irregularly.

An alternative would have been the strategy used in FreeBSD 4.3, where dirty pages get to stay on the inactive list longer than clean pages but are synced out before the clean pages are exhausted. That strategy gives FreeBSD more consistent pageout IO during heavy write loads. However, a big factor in the irregular pageout writes seen with the simpler strategy above may well be the huge inactive list target in FreeBSD (33% of swappable pages). It is not at all clear what this more complicated strategy would do when used on the dynamically sized inactive list of Linux 2.4, so Linux 2.4 uses the better-understood strategy of evicting clean inactive pages first and only syncing out the dirty ones once those are gone.
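The clean-pages-first policy is straightforward to sketch. The following user-space rendition shows the shape of page_launder() as described above; the list handling is simplified and the writeback helper is a stand-in, not the 2.4 source.

```c
#include <stdbool.h>
#include <stddef.h>

struct page_s {
    struct page_s *next;
    bool dirty;
};

static struct page_s *inactive_dirty_list;  /* eviction candidates */
static struct page_s *inactive_clean_list;  /* reclaimable, count as free */

static void start_pageout_io(struct page_s *page)
{
    /* stand-in for starting asynchronous writeback; once the IO
     * completes the page is clean and reclaimable on a later pass */
    page->dirty = false;
}

/* Sketch of page_launder(): reclaim already-clean pages (including
 * ones the update daemon wrote out earlier) without touching the
 * disk, and only start writeback IO when no clean pages were found. */
static int page_launder_sketch(void)
{
    int moved = 0;

    /* Pass 1: move clean pages to the inactive clean list, no IO. */
    for (struct page_s **pp = &inactive_dirty_list; *pp; ) {
        struct page_s *p = *pp;
        if (!p->dirty) {
            *pp = p->next;                  /* unlink from dirty list */
            p->next = inactive_clean_list;  /* push onto clean list */
            inactive_clean_list = p;
            moved++;
        } else {
            pp = &p->next;
        }
    }
    if (moved)
        return moved;   /* reads stay undisturbed by writeout seeks */

    /* Pass 2: clean pages exhausted, so sync out the dirty ones;
     * they become reclaimable once the writeback IO completes. */
    for (struct page_s *p = inactive_dirty_list; p; p = p->next)
        start_pageout_io(p);
    return 0;
}
```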
3.5 Background page aging

On many systems the normal mode of operation is that after a period of relative inactivity a sudden load spike comes in and the system has to deal with it as gracefully as possible. Linux 2.2 has the problem that, lacking an inactive page list, it is not at all clear which pages should be evicted when a sudden demand for memory kicks in. Linux 2.4 is better in this respect, with the reclaim candidates neatly separated out on the inactive list. However, the inactive list could have any random size at the moment VM pressure drops off; we'd like to get the system into a more predictable state while VM pressure is low. To achieve this, Linux 2.4 does background scanning of pages, trying to get a sane number of pages onto the inactive list, but without scanning aggressively, so that only truly idle pages end up on the inactive list and the scanning overhead stays small.

3.6 Drop behind

Streaming IO doesn't just have readahead, but also its natural complement: drop behind. After the program doing the streaming IO is done with a page, we depress its priority heavily so it becomes a prime candidate for eviction. Not only does this protect the working set of running processes from being quickly evicted by streaming IO, it also keeps the streaming IO from competing with the pageouts and pageins of the other running processes, which reduces the number of disk seeks and allows the streaming IO to proceed at a faster speed. Currently readahead and drop-behind only work for read() and write(); mmap()ed files and swap-backed anonymous memory aren't supported yet.
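A hedged sketch of drop behind: once a sequential reader has consumed a page, its age is knocked down to zero so the aging machinery of section 3.1 reclaims it quickly. The names and the sequentiality test here are illustrative assumptions, not the 2.4 readahead code.

```c
/* Illustrative drop-behind for sequential reads. */
struct file_ra_sketch {
    unsigned long prev_index;   /* last page index the reader touched */
};

struct page_sk {
    int age;                    /* page age from section 3.1 */
};

/* Called after read() has copied a page's data to user space.
 * If the file is being read sequentially, this page just left the
 * stream's window and will not be needed again soon. */
void drop_behind(struct file_ra_sketch *ra, unsigned long index,
                 struct page_sk *page)
{
    if (index == ra->prev_index + 1) {   /* sequential access pattern */
        page->age = 0;                   /* prime eviction candidate:
                                            it moves to the inactive
                                            list instead of displacing
                                            other processes' pages */
    }
    ra->prev_index = index;
}
```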
4 Conclusions

Since the Linux 2.4 kernel's VM subsystem is still being tuned heavily, it is too early to present conclusive performance figures. However, initial results seem to indicate that Linux 2.4 generally performs better than Linux 2.2 on the same hardware. Reports from users indicate that performance on typical desktop machines has improved a lot, even though tuning of the new VM has only just begun. Throughput figures for server machines seem to be better too, though that could also be attributed to the completed unification of the page cache and the buffer cache.

One big difference between the VM in Linux 2.4 and the VM in Linux 2.2 is that the new VM is far less sensitive to subtle changes. While in Linux 2.2 a subtle change in the page flushing logic could upset page replacement, in Linux 2.4 it is possible to tweak the various aspects of the VM with predictable results and little to no side effects on the rest of the VM. The solid performance and relative insensitivity to subtle changes in the environment can be taken as a sign that the Linux 2.4 VM is not just a set of simple fixes for the problems experienced in Linux 2.2, but also a good base for future development.

5 Remaining issues

The Linux 2.4 VM mainly contains easy-to-implement and easy-to-verify solutions for some of the known problems Linux 2.2 suffers from. A number of issues are either too subtle to implement during the code freeze or would have too much impact on the code. The complete list of TODO items can be found on the Linux-MM page [Linux-MM]; here are the most important ones:

- Low memory deadlock prevention: with the arrival of journaling and delayed-allocation filesystems it is possible that the system will need to allocate memory in order to free memory; more precisely, to write out data so memory can become freeable. To remove the possibility of deadlock, we need to limit the number of outstanding transactions to a safe number, possibly letting each of the page flushing functions indicate how much memory it may need and keeping bookkeeping of these values. Note that the same problem occurs with swap over the network.
- Load control: no matter how good the page replacement code gets, there will always be a point where the system ends up thrashing to death. Implementing a simple load control system, where processes get suspended in round-robin fashion when the paging load gets too high, can keep the system alive under heavy overload and allow it to get enough work done to bring itself back to a sane state.
- RSS limits and guarantees: in some situations it is desirable to control the amount of physical memory a process can consume (the resident set size, or RSS). With the virtual-address-based page scanning of the Linux VM subsystem it is trivial to implement RSS ulimits and minimal RSS guarantees. Both help to protect processes under heavy load and allow the system administrator to better control the use of memory resources.
- VM balancing: in Linux 2.4, the balancing between the eviction of cache pages, swap-backed anonymous memory and the inode and dentry caches is essentially the same as in Linux 2.2. While this seems to work well for most cases, there are some possible scenarios where a few of the caches push the other users out of memory, leading to suboptimal system performance. It may be worthwhile to look into improving the balancing algorithm to achieve better performance in nonstandard situations.
- Unified readahead: currently readahead and drop-behind only work for read() and write(). Ideally they should work for mmap()ed files and anonymous memory too. Having the same set of algorithms for read()/write(), mmap() and swap-backed anonymous memory would simplify the code and make performance improvements in the readahead and drop-behind code immediately available to all of the system.

6 Acknowledgements

The author would like to thank, in no particular order: Stephen Tweedie, for taking care of memory management in Linux 1.2, 2.0 and 2.2 and for his help with this paper; Matt Dillon, for taking the time to explain the rationale behind every little piece of the FreeBSD VM; Conectiva Inc., who employ the author to hack the Linux kernel; and the wonderful crowd of testers from #kernelnewbies [Kernelnewbies] and elsewhere who have helped flush out the bugs in the Linux 2.4 VM.

References

[de Castro] Rodrigo S. de Castro, Linux 2.4 Virtual Memory Overview (2001). net/vm24/
[Dillon] Matthew Dillon, Design Elements of the FreeBSD VM System (2000). freebsd_vm.html
[Kernelnewbies] Kernelnewbies
[Linux-MM] The Linux Memory Management home page
[Smaragdis, Kaplan, Wilson] Yannis Smaragdakis, Scott F. Kaplan and Paul R. Wilson, EELRU: Simple and Effective Adaptive Page Replacement, SIGMETRICS '99. papers/index.html

[Note] Extensive documentation about page replacement algorithms is available practically everywhere. The three algorithms discussed in this paper are:
- NRU (Not Recently Used): we scan through memory and evict every page that wasn't accessed since we last scanned it.
- LRU (Least Recently Used): we evict those pages that haven't been accessed for the longest time.
- LFU (Least Frequently Used): we evict those pages that have been accessed least frequently in recent times.