Dynamic Storage Allocation: A Survey and Critical Review *
Paul R. Wilson, Mark S. Johnstone, Michael Neely, and David Boles**
Department of Computer Sciences, University of Texas at Austin, Austin, Texas, 78751, USA
(wilson | markj | neely@cs.utexas.edu)

Abstract. Dynamic memory allocation has been a fundamental part of most computer systems since roughly 1960, and memory allocation is widely considered to be either a solved problem or an insoluble one. In this survey, we describe a variety of memory allocator designs and point out issues relevant to their design and evaluation. We then chronologically survey most of the literature on allocators between 1961 and 1995. (Scores of papers are discussed, in varying detail, and over 150 references are given.) We argue that allocator designs have been unduly restricted by an emphasis on mechanism, rather than policy, while the latter is more important; higher-level strategic issues are still more important, but have not been given much attention. Most theoretical analyses and empirical allocator evaluations to date have relied on very strong assumptions of randomness and independence, but real program behavior exhibits important regularities that must be exploited if allocators are to perform well in practice.

1 Introduction and Contents

In this survey, we will discuss the design and evaluation of conventional dynamic memory allocators. By "conventional," we mean allocators used for general purpose "heap" storage, where a program can request a block of memory to store a program object, and free that block at any time. A heap, in this sense, is a pool of memory available for the allocation and deallocation of arbitrary-sized blocks of memory in arbitrary order.3 An allocated block is typically used to store a program "object," which is some kind of structured data item such as a

* This work was supported by the National Science Foundation under grant CCR, and by a gift from Novell, Inc.
** Author's current address: Convex Computer Corporation, Dallas, Texas, USA. (dboles@zeppelin.convex.com)

3 This sense of "heap" is not to be confused with a quite different sense of "heap," meaning a partially ordered tree structure.
Pascal record, a C struct, or a C++ object, but not necessarily an object in the sense of object-oriented programming.4

Throughout this paper, we will assume that while a block is in use by a program, its contents (a data object) cannot be relocated to compact memory (as is done, for example, in copying garbage collectors [Wil95]). This is the usual situation in most implementations of conventional programming systems (such as C, Pascal, Ada, etc.), where the memory manager cannot find and update pointers to program objects when they are moved.5 The allocator does not examine the data stored in a block, or modify or act on it in any way. The data areas within blocks that are used to hold objects are contiguous and nonoverlapping ranges of (real or virtual) memory. We generally assume that only entire blocks are allocated or freed, and that the allocator is entirely unaware of the types or values of data stored in a block--it only knows the size requested.

Scope of this survey. In most of this survey, we will concentrate on issues of overall memory usage, rather than time costs. We believe that detailed measures of time costs are usually a red herring, because they obscure issues of strategy and policy; we believe that most good strategies can yield good policies that are amenable to efficient implementation. (We believe that it's easier to make a very fast allocator than a very memory-efficient one, using fairly straightforward techniques (Section 3.12). Beyond a certain point, however, the effectiveness of speed optimizations will depend on many of the same subtle issues that determine memory usage.) We will also discuss locality of reference only briefly. Locality of reference is increasingly important, as the difference between CPU speeds and main memory (or disk) speeds has grown dramatically, with no sign of stopping.
Locality is very poorly understood, however; aside from making a few important general comments, we leave most issues of locality to future research. Except where locality issues are explicitly noted, we assume that the cost of a unit of memory is fixed and uniform. We do not address possible interactions with unusual memory hierarchy schemes such as compressed caching, which may complicate locality issues and interact in other important ways with allocator design [WLM91, Wil91, Dou93].

4 While this is the typical situation, it is not the only one. The "objects" stored by the allocator need not correspond directly to language-level objects. An example of this is a growable array, represented by a fixed-size part that holds a pointer to a variable-sized part. The routine that grows an object might allocate a new, larger variable-sized part, copy the contents of the old variable-sized part into it, and deallocate the old part. We assume that the allocator knows nothing of this, and would view each of these parts as separate and independent objects, even if normal programmers would see a "single" object.

5 It is also true of many garbage-collected systems. In some, insufficient information is available from the compiler and/or programmer to allow safe relocation; this is especially likely in systems where code written in different languages is combined in an application [BW88]. In other, real-time and/or concurrent systems, it is difficult for the garbage collector to relocate data without incurring undue overhead and/or disruptiveness [Wil95].
We will not discuss specialized allocators for particular applications where the data representations and allocator designs are intertwined.6 Allocators for these kinds of systems share many properties with the "conventional" allocators we discuss, but introduce many complicating design choices. In particular, they often allow logically contiguous items to be stored noncontiguously, e.g., in pieces of one or a few fixed sizes, and may allow sharing of parts or (other) forms of data compression. We assume that if any fragmenting or compression of higher-level "objects" happens, it is done above the level of abstraction of the allocator interface, and the allocator is entirely unaware of the relationships between the "objects" (e.g., fragments of higher-level objects) that it manages. Similarly, parallel allocators are not discussed, due to the complexity of the subject.

Structure of the Paper. This survey is intended to serve two purposes: as a general reference for techniques in memory allocators, and as a review of the literature in the field, including methodological considerations. Much of the literature review has been separated into a chronological review, in Section 4. This section may be skipped or skimmed if methodology and history are not of interest to the reader, especially on a first reading. However, some potentially significant points are covered only there, or only made sufficiently clear and concrete there, so the serious student of dynamic storage allocation should find it worthwhile. (It may even be of interest to those interested in the history and philosophy of computer science, as documentation of the development of a scientific paradigm.7)

The remainder of the current section gives our motivations and goals for the paper, and then frames the central problem of memory allocation--fragmentation--and the general techniques for dealing with it.
Section 2 discusses deeper issues in fragmentation, and methodological issues (some of which may be skipped) in studying it. Section 3 presents a fairly traditional taxonomy of known memory allocators, including several not usually covered. It also explains why such mechanism-based taxonomies are very limited, and may obscure more important policy issues. Some of those policy issues are sketched. Section 4 reviews the literature on memory allocation. A major point of this section is that the main stream of allocator research over the last several decades has focused on oversimplified (and unrealistic) models of program behavior, and

6 Examples include specialized allocators for chained-block message-buffers (e.g., [Wol65]), "cdr-coded" list-processing systems [BC79], specialized storage for overlapping strings with shared structure, and allocators used to manage disk storage in file systems.

7 We use "paradigm" in roughly the sense of Kuhn [Kuh70], as a "pattern or model" for research. The paradigms we discuss are not as broad in scope as the ones usually discussed by Kuhn, but on our reading, his ideas are intended to apply at a variety of scales. We are not necessarily in agreement with all of Kuhn's ideas, or with some of the extreme and anti-scientific purposes they have been put to by others.
that little is actually known about how to design allocators, or what performance to expect. Section 5 concludes by summarizing the major points of the paper, and suggesting avenues for future research.

Table of Contents

1 Introduction and Contents
  1.1 Motivation
  1.2 What an Allocator Must Do
  1.3 Strategies, Placement Policies, and Splitting and Coalescing
    Strategy, policy, and mechanism
    Splitting and coalescing
2 A Closer Look at Fragmentation, and How to Study It
  2.1 Internal and External Fragmentation
  2.2 The Traditional Methodology: Probabilistic Analyses, and Simulation Using Synthetic Traces
    Random simulations
    Probabilistic analyses
    A note on exponentially-distributed random lifetimes
    A note on Markov models
  2.3 What Fragmentation Really Is, and Why the Traditional Approach is Unsound
    Fragmentation is caused by isolated deaths
    Fragmentation is caused by time-varying behavior
    Implications for experimental methodology
  2.4 Some Real Program Behaviors
    Ramps, peaks, and plateaus
    Fragmentation at peaks is important
    Exploiting ordering and size dependencies
    Implications for strategy
    Implications for research
    Profiles of some real programs
    Summary
  2.5 Deferred Coalescing and Deferred Reuse
    Deferred coalescing
    Deferred reuse
  2.6 A Sound Methodology: Simulation Using Real Traces
    Tracing and simulation
    Locality studies
3 A Taxonomy of Allocators
  3.1 Allocator Policy Issues
  3.2 Some Important Low-Level Mechanisms
    Header fields and alignment
    Boundary tags
    Link fields within blocks
    Lookup tables
    Special treatment of small objects
    Special treatment of the end block of the heap
  3.3 Basic Mechanisms
  3.4 Sequential Fits
  3.5 Discussion of Sequential Fits and General Policy Issues
  3.6 Segregated Free Lists
  3.7 Buddy Systems
  3.8 Indexed Fits
    Discussion of indexed fits
  3.9 Bitmapped Fits
  3.10 Discussion of Basic Mechanisms
  3.11 Quick Lists and Deferred Coalescing
    Scheduling of coalescing
    What to coalesce
    Discussion
  3.12 A Note on Time Costs
4 A Chronological Review of The Literature
  4.1 The first three decades: 1960 to 1990
  4.2 Recent Studies Using Real Traces
    Zorn, Grunwald, et al.
    Vo
    Wilson, Johnstone, Neely, and Boles
5 Summary and Conclusions
    Models and Theories
    Strategies and Policies
    Mechanisms
    Experiments
    Data
    Challenges and Opportunities
1.1 Motivation

This paper is motivated by our perception that there is considerable confusion about the nature of memory allocators, and about the problem of memory allocation in general. Worse, this confusion is often unrecognized, and allocators are widely thought to be fairly well understood. In fact, we know little more about allocators than was known twenty years ago, which is not as much as might be expected. The literature on the subject is rather inconsistent and scattered, and considerable work appears to be done using approaches that are quite limited. We will try to sketch a unifying conceptual framework for understanding what is and is not known, and suggest promising approaches for new research.

This problem with the allocator literature has considerable practical importance. Aside from the human effort involved in allocator studies per se, there are effects in the real world, both on computer system costs, and on the effort required to create real software. We think it is likely that the widespread use of poor allocators incurs a loss of main and cache memory (and CPU cycles) worth upwards of a billion (10^9) U.S. dollars worldwide--a significant fraction of the world's memory and processor output may be squandered, at huge cost.8

Perhaps even worse is the effect on programming style due to the widespread use of allocators that are simply bad--either because better allocators are known but not widely known or understood, or because allocation research has failed to address the proper issues. Many programmers avoid heap allocation in many situations, because of perceived space or time costs.9

It seems significant to us that many articles in non-refereed publications--and a number in refereed publications outside the major journals of operating systems and programming languages--are motivated by extreme concerns about the speed or memory costs of general heap allocation. (One such paper [GM85] is discussed in Section 4.1.)
Often, ad hoc solutions are used for applications that should not be problematic at all, because at least some well-designed general allocators should do quite well for the workload in question. We suspect that in some cases, the perceptions are wrong, and that the costs of modern heap allocation are simply overestimated. In many cases, however, it appears that poorly-designed or poorly-implemented allocators have led to a widespread and quite understandable belief that general heap allocation is

8 This is an unreliable estimate based on admittedly casual last-minute computations, approximately as follows: there are on the order of 100 million PC's in the world. If we assume that they have an average of 10 megabytes of memory at $30 per megabyte, there is 30 billion dollars worth of RAM at stake. (With the expected popularity of Windows 95, this seems like it will soon become a fairly conservative estimate, if it isn't already.) If just one fifth (6 billion dollars worth) is used for heap-allocated data, and one fifth of that is unnecessarily wasted, the cost is over a billion dollars.

9 It is our impression that UNIX programmers' usage of heap allocation went up significantly when Chris Kingsley's allocator was distributed with BSD 4.2 UNIX--simply because it was much faster than the allocators they'd been accustomed to. Unfortunately, that allocator is somewhat wasteful of space.
necessarily expensive. Too many poor allocators have been supplied with widely distributed operating systems and compilers, and too few practitioners are aware of the alternatives. This appears to be changing, to some degree. Many operating systems now supply fairly good allocators, and there is an increasing trend toward marketing libraries that include general allocators which are at least claimed to be good, as a replacement for default allocators. It seems likely that there is simply a lag between the improvement in allocator technology and its widespread adoption, and another lag before programming style adapts. The combined lag is quite long, however, and we have seen several magazine articles in the last year on how to avoid using a general allocator. Postings praising ad hoc allocation schemes are very common in the Usenet newsgroups oriented toward real-world programming.

The slow adoption of better technology and the lag in changes in perceptions may not be the only problems, however. We have our doubts about how well allocators are really known to work, based on a fairly thorough review of the literature. We wonder whether some part of the perception is due to occasional programs that interact pathologically with common allocator designs, in ways that have never been observed by researchers. This does not seem unlikely, because most experiments have used non-representative workloads, which are extremely unlikely to generate the same problematic request patterns as real programs. Sound studies using realistic workloads are too rare. The total number of real, nontrivial programs that have been used for good experiments is very small, apparently less than 20. A significant number of real programs could exhibit problematic behavior patterns that are simply not represented in studies to date. Long-running processes such as operating systems, interactive programming environments, and networked servers may pose special problems that have not been addressed.
Most experiments to date have studied programs that execute for a few minutes (at most) on common workstations. Little is known about what happens when programs run for hours, days, weeks or months. It may well be that some seemingly good allocators do not work well in the long run, with their memory efficiency slowly degrading until they perform quite badly. We don't know--and we're fairly sure that nobody knows. Given that long-running processes are often the most important ones, and are increasingly important with the spread of client/server computing, this is a potentially large problem. The worst case performance of any general allocator amounts to complete failure due to memory exhaustion or virtual memory thrashing (Section 1.2). This means that any real allocator may have lurking "bugs" and fail unexpectedly for seemingly reasonable inputs. Such problems may be hidden, because most programmers who encounter severe problems may simply code around them using ad hoc storage management techniques--or, as is still painfully common, by statically allocating "enough" memory for variable-sized structures. These ad-hoc approaches to storage management lead to "brittle" software with hidden limitations (e.g., due to the use
of fixed-size arrays). The impact on software clarity, flexibility, maintainability, and reliability is quite important, but difficult to estimate. These hidden costs should not be underestimated, however, because they can lead to major penalties in productivity and to significant human costs in sheer frustration, anxiety, and general suffering. A much larger and broader set of test applications and experiments is needed before we have any assurance that any allocator works reliably--in a crucial performance sense--much less works well. Given this caveat, however, it appears that some allocators are clearly better than others in most cases, and this paper will attempt to explain the differences.

1.2 What an Allocator Must Do

An allocator must keep track of which parts of memory are in use, and which parts are free. The goal of allocator design is usually to minimize wasted space without undue time cost, or vice versa. The ideal allocator would spend negligible time managing memory, and waste negligible space.

A conventional allocator cannot control the number or size of live blocks--they are entirely up to the program requesting and releasing the space managed by the allocator. A conventional allocator also cannot compact memory, moving blocks around to make them contiguous and make free memory contiguous. It must respond immediately to a request for space, and once it has decided which block of memory to allocate, it cannot change that decision--that block of memory must be regarded as inviolable until the application program chooses to free it.10 It can only deal with memory that is free, and only choose where in free memory to allocate the next requested block. (Allocators record the locations and sizes of free blocks of memory in some kind of hidden data structure, which may be a linear list, a totally or partially ordered tree, a bitmap, or some hybrid data structure.)
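To make the allocator's bookkeeping role concrete, here is a minimal sketch of this interface. It is not from the paper, all names are our own, and it deliberately sidesteps the policy questions discussed later by using a simple first-fit scan; it only illustrates that the allocator tracks free ranges, carves allocated blocks out of them, and never moves a block once placed:

```python
class ToyAllocator:
    """Illustrative bookkeeping only: tracks which ranges of a fixed
    heap are free, without ever relocating allocated blocks."""

    def __init__(self, heap_size):
        # One free block covering the whole heap: list of (offset, size).
        self.free_blocks = [(0, heap_size)]
        self.allocated = {}  # offset -> size, so free() knows block sizes

    def malloc(self, size):
        # First-fit scan (placement policy is a separate question;
        # first fit is used here purely for brevity).
        for i, (off, blk) in enumerate(self.free_blocks):
            if blk >= size:
                if blk == size:
                    del self.free_blocks[i]
                else:
                    # Split: the remainder stays on the free list.
                    self.free_blocks[i] = (off + size, blk - size)
                self.allocated[off] = size
                return off
        return None  # exhausted (or too fragmented to satisfy the request)

    def free(self, off):
        size = self.allocated.pop(off)
        # No coalescing in this sketch; adjacent free blocks stay separate.
        self.free_blocks.append((off, size))
```

Note that `free` here simply returns the block to the free list; the splitting and coalescing techniques discussed in Section 1.3 address what a real allocator does with such returned blocks.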
An allocator is therefore an online algorithm, which must respond to requests in strict sequence, immediately, and its decisions are irrevocable. The problem the allocator must address is that the application program may free blocks in any order, creating "holes" amid live objects. If these holes are too numerous and small, they cannot be used to satisfy future requests for larger blocks. This problem is known as fragmentation, and it is a potentially disastrous one. For the general case that we have outlined--where the application program may allocate arbitrary-sized objects at arbitrary times and free them at any later time--there is no reliable algorithm for ensuring efficient memory usage, and none is possible. It has been proven that for any possible allocation algorithm, there will always be the possibility that some application program will allocate and deallocate blocks in some fashion that defeats the allocator's strategy, and forces it into severe fragmentation [Rob71, GGU72, Rob74, Rob77]. Not only are 10 We use the term "application" rather generally; the "application" for which an allocator manages storage may be a system program such as a file server, or even an operating system kernel.
there no provably good allocation algorithms, there are proofs that any allocator will be "bad" for some possible applications.

The lower bound on worst-case fragmentation is generally proportional to the amount of live data11 multiplied by the logarithm of the ratio between the largest and smallest block sizes, i.e., M log2 n, where M is the amount of live data and n is the ratio between the largest and smallest object sizes [Rob71]. (In discussing worst-case memory costs, we generally assume that all block sizes are evenly divisible by the smallest block size, and n is sometimes simply called "the largest block size," i.e., in units of the smallest.) Of course, for some algorithms, the worst case is much worse, often proportional to the simple product of M and n.

So, for example, if the minimum and maximum object sizes are one word and a million words, then fragmentation in the worst case may cost an excellent allocator a factor of ten or twenty in space. A less robust allocator may lose a factor of a million, in its worst case, wasting so much space that failure is almost certain.

Given the apparent insolubility of this problem, it may seem surprising that dynamic memory allocation is used in most systems, and the computing world does not grind to a halt due to lack of memory. The reason, of course, is that there are allocators that are fairly good in practice, in combination with most actual programs. Some allocation algorithms have been shown in practice to work acceptably well with real programs, and have been widely adopted. If a particular program interacts badly with a particular allocator, a different allocator may be used instead. (The bad cases for one allocator may be very different from the bad cases for other allocators of different design.) The design of memory allocators is currently something of a black art.
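To put rough numbers on the worst-case bounds discussed above, a quick calculation (ours, purely illustrative; the figure for M is arbitrary) with a size ratio of n = 10^6, matching the one-word-to-a-million-words example:

```python
import math

M = 10_000_000   # words of live data (arbitrary illustrative figure)
n = 1_000_000    # ratio of largest to smallest block size

# A robust allocator's worst case grows like M * log2(n)...
robust_worst = M * math.log2(n)    # log2(10^6) is about 19.9, so a
                                   # factor-of-twenty blow-up at worst
# ...while a fragile allocator's worst case grows like M * n.
fragile_worst = M * n              # a factor-of-a-million blow-up
```

The log2(n) factor of roughly 19.9 is where the paper's "factor of ten or twenty" comes from; the M * n product is the "factor of a million."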
Little is known about the interactions between programs and allocators, and which programs are likely to bring out the worst in which allocators. However, one thing is clear--most programs are "well behaved" in some sense. Most programs combined with most common allocators do not squander huge amounts of memory, even if they may waste a quarter of it, or a half, or occasionally even more. That is, there are regularities in program behavior that allocators exploit, a point that is often insufficiently appreciated even by professionals who design and implement allocators. These regularities are exploited by allocators to prevent excessive fragmentation, and make it possible for allocators to work in practice. These regularities are surprisingly poorly understood, despite 35 years of allocator research, and scores of papers by dozens of researchers. 1.3 Strategies, Placement Policies, and Splitting and Coalescing The main technique used by allocators to keep fragmentation under control is placement choice. Two subsidiary techniques are used to help implement that 11 We use "live" here in a different sense from that used in garbage collection or in compiler flow analysis. Blocks are "live" from the point of view of the allocator if it doesn't know that it can safely reuse the storage--i.e., if the block was allocated but not yet freed.
choice: splitting blocks to satisfy smaller requests, and coalescing of free blocks to yield larger blocks.

Placement choice is simply the choosing of where in free memory to put a requested block. Despite potentially fatal restrictions on an allocator's online choices, the allocator also has a huge freedom of action--it can place a requested block anywhere it can find a sufficiently large range of free memory, and anywhere within that range. (It may also be able to simply request more memory from the operating system.) An allocator algorithm therefore should be regarded as the mechanism that implements a placement policy, which is motivated by a strategy for minimizing fragmentation.

Strategy, policy, and mechanism. The strategy takes into account regularities in program behavior, and determines a range of acceptable policies as to where to allocate requested blocks. The chosen policy is implemented by a mechanism, which is a set of algorithms and the data structures they use. This three-level distinction is quite important. In the context of general memory allocation,

- a strategy attempts to exploit regularities in the request stream,
- a policy is an implementable decision procedure for placing blocks in memory, and
- a mechanism is a set of algorithms and data structures that implement the policy, often over-simply called "an algorithm."12

An ideal strategy is "put blocks where they won't cause fragmentation later"; unfortunately that's impossible to guarantee, so real strategies attempt to heuristically approximate that ideal, based on assumed regularities of application programs' behavior. For example, one strategy is "avoid letting small long-lived

12 This set of distinctions is doubtless indirectly influenced by work in very different areas, notably Marr's work in natural and artificial visual systems [Mar82] and McClamrock's work in the philosophy of science and cognition [McC91, McC95].
The distinctions are important for understanding a wide variety of complex systems, however. Similar distinctions are made in many fields, including empirical computer science, though often without making them quite clear. In "systems" work, mechanism and policy are often distinguished, but strategy and policy are usually not distinguished explicitly. This makes sense in some contexts, where the policy can safely be assumed to implement a well-understood strategy, or where the choice of strategy is left up to someone else (e.g., designers of higher-level code not under discussion). In empirical evaluations of very poorly understood strategies, however, the distinction between strategy and policy is often crucial. (For example, errors in the implementation of a strategy are often misinterpreted as evidence that the expected regularities don't actually exist, when in fact they do, and a slightly different strategy would work much better.) Mistakes are possible at each level; equally important, mistakes are possible between levels, in the attempt to "cash out" (implement) the higher-level strategy as a policy, or a policy as a mechanism.
objects prevent you from reclaiming a larger contiguous free area." This is part of the strategy underlying the common "best fit" family of policies. Another part of the strategy is "if you have to split a block and potentially waste what's left over, minimize the size of the wasted part." The corresponding (best fit) policy is more concrete--it says "always use the smallest block that is at least large enough to satisfy the request."

The placement policy determines exactly where in memory requested blocks will be allocated. For the best fit policies, the general rule is "allocate objects in the smallest free block that's at least big enough to hold them." That's not a complete policy, however, because there may be several equally good fits; the complete policy must specify which of those should be chosen, for example, the one whose address is lowest.

The chosen policy is implemented by a specific mechanism, chosen to implement that policy efficiently in terms of time and space overheads. For best fit, a linear list or ordered tree structure might be used to record the addresses and sizes of free blocks, and a tree search or list search would be used to find the one dictated by the policy.

These levels of the allocator design process interact. A strategy may not yield an obvious complete policy, and the seemingly slight differences between similar policies may actually implement interestingly different strategies. (This results from our poor understanding of the interactions between application behavior and allocator strategies.) The chosen policy may not be obviously implementable at reasonable cost in space, time, or programmer effort; in that case some approximation may be used instead.
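As an illustration of how a complete placement policy can be stated precisely, here is a small sketch (our own code, with hypothetical names) of the best-fit rule over a free list of (address, size) pairs, with the lowest-address tie-break mentioned above. A naive linear search like this is one possible mechanism; a size-ordered tree would implement the same policy more efficiently:

```python
def best_fit(free_blocks, size):
    """Best-fit placement: among free blocks large enough for the
    request, pick the smallest; break ties by lowest address."""
    candidates = [(off, blk) for off, blk in free_blocks if blk >= size]
    if not candidates:
        return None  # external fragmentation: no block is big enough
    # Order by (block size, address): tightest fit wins, and among
    # equally tight fits, the lowest address wins.
    return min(candidates, key=lambda c: (c[1], c[0]))
```

For example, given free blocks of sizes 64, 16, 16, and 8, a request for 12 words selects the 16-word block at the lower address, not the 64-word block a first-fit scan might return.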
The strategy and policy are often very poorly-defined, as well, and the policy and mechanism are arrived at by a combination of educated guessing, trial and error, and (often dubious) experimental validation. In case the important distinctions between strategy, policy, and mechanism are not clear, a metaphorical example may help. Consider a software company that has a strategy for improving productivity: rewarding the most productive programmers. It may institute a policy of rewarding programmers who produce the largest numbers of lines of program code. To implement this policy, it may use the mechanisms of instructing the managers to count lines of code, and providing scripts that count lines of code according to some particular algorithm. This example illustrates the possible failures at each level, and in the mapping from one level to another. The strategy may simply be wrong, if programmers aren't particularly motivated by money. The policy may not implement the intended strategy, if lines of code are an inappropriate metric of productivity, or if the policy has unintended "strategic" effects, e.g., due to programmer resentment. The mechanism may also fail to implement the specified policy, if the rules for line-counting aren't enforced by managers, or if the supplied scripts don't correctly implement the intended counting function. This distinction between strategy and policy is oversimplified, because there may be multiple levels of strategy that shade off into increasingly concrete policies. At different levels of abstraction, something might be viewed as a strategy or policy. The key point is that there are at least three qualitatively different kinds of levels
Splitting and coalescing. Two general techniques for supporting a range of (implementations of) placement policies are splitting and coalescing of free blocks. (These mechanisms are important subsidiary parts of the larger mechanism that is the allocator implementation.)

The allocator may split large blocks into smaller blocks arbitrarily, and use any sufficiently-large subblock to satisfy the request. The remainders from this splitting can be recorded as smaller free blocks in their own right and used to satisfy future requests.

The allocator may also coalesce (merge) adjacent free blocks to yield larger free blocks. After a block is freed, the allocator may check to see whether the neighboring blocks are free as well, and merge them into a single, larger block. This is often desirable, because one large block is more likely to be useful than two small ones--large or small requests can be satisfied from large blocks.

Completely general splitting and coalescing can be supported at fairly modest cost in space and/or time, using simple mechanisms that we'll describe later. This allows the allocator designer the maximum freedom in choosing a strategy, policy, and mechanism for the allocator, because the allocator can have a complete and accurate record of which ranges of memory are available at all times. The cost may not be negligible, however, especially if splitting and coalescing work too well--in that case, freed blocks will usually be coalesced with neighbors to form large blocks of free memory, and later allocations will have to split smaller chunks off of those blocks to obtain the desired sizes. It often turns out that most of this effort is wasted, because the sizes requested later are largely the same as the sizes freed earlier, and the old small blocks could have been reused without coalescing and splitting.
Because of this, many modern allocators use deferred coalescing--they avoid coalescing and splitting most of the time, but use them intermittently to combat fragmentation.

(Continuing the discussion of levels of abstraction above [McC91]: at the upper levels, there is the general design goal of exploiting expected regularities, and a set of strategies for doing so; there may be subsidiary strategies, for example to resolve conflicts between strategies in the best possible way. At a somewhat lower level there is a general policy of where to place objects, and below that is a more detailed policy that exactly determines placement. Below that there is an actual mechanism that is intended to implement the policy (and presumably effect the strategy), using whatever algorithms and data structures are deemed appropriate. Mechanisms are often layered, as well, in the usual manner of structured programming [Dij69]. Problems at (and between) these levels are the best understood--an algorithm may not implement its specification, or may be improperly specified. Analogous problems occur at the upper levels as well--if expected regularities don't actually occur, or if they do occur but the strategy doesn't actually exploit them, and so on.)

2 A Closer Look at Fragmentation, and How to Study It

In this section, we will discuss the traditional conception of fragmentation, and the usual techniques used for studying it. We will then explain why the usual
understanding is not strong enough to support scientific design and evaluation of allocators. We then propose a new (though nearly obvious) conception of fragmentation and its causes, and describe more suitable techniques used to study it. (Most of the experiments using sound techniques have been performed in the last few years, but a few notable exceptions were done much earlier, e.g., [MPS71] and [LH82], discussed in Section 4.)

2.1 Internal and External Fragmentation

Traditionally, fragmentation is classed as external or internal [Ran69], and is combated by splitting and coalescing free blocks.

External fragmentation arises when free blocks of memory are available for allocation, but can't be used to hold objects of the sizes actually requested by a program. In sophisticated allocators, that's usually because the free blocks are too small, and the program requests larger objects. In some simple allocators, external fragmentation can occur because the allocator is unwilling or unable to split large blocks into smaller ones.

Internal fragmentation arises when a large-enough free block is allocated to hold an object, but there is a poor fit because the block is larger than needed. In some allocators, the remainder is simply wasted, causing internal fragmentation. (It's called internal because the wasted memory is inside an allocated block, rather than being recorded as a free block in its own right.)

To combat internal fragmentation, most allocators will split blocks into multiple parts, allocating part of a block, and then regarding the remainder as a smaller free block in its own right. Many allocators will also coalesce adjacent free blocks (i.e., neighboring free blocks in address order), combining them into larger blocks that can be used to satisfy requests for larger objects.
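The distinction can be made concrete with a toy calculation (our own illustration; the block sizes are arbitrary):

```python
# Toy illustration (ours, simplified) of the two kinds of fragmentation.
# Internal: memory wasted *inside* allocated blocks (block bigger than request).
# External: free memory exists, but no single free block fits a request.

allocated = [(16, 10), (32, 32), (64, 40)]   # (block size, requested size)
free_blocks = [8, 8, 12]                     # sizes of the free blocks

internal = sum(block - req for block, req in allocated)

request = 24                                 # a pending allocation request
total_free = sum(free_blocks)
fits = any(b >= request for b in free_blocks)
externally_fragmented = total_free >= request and not fits

print(internal)                # 30 units lost inside allocated blocks
print(externally_fragmented)   # True: 28 units free, but largest block is 12
```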
In some allocators, internal fragmentation arises due to implementation constraints within the allocator--for speed or simplicity reasons, the allocator design restricts the ways memory may be subdivided. In other allocators, internal fragmentation may be accepted as part of a strategy to prevent external fragmentation--the allocator may be unwilling to fragment a block, because if it does, it may not be able to coalesce it again later and use it to hold another large object.

2.2 The Traditional Methodology: Probabilistic Analyses, and Simulation Using Synthetic Traces

(Note: readers who are uninterested in experimental methodology may wish to skip this section, at least on a first reading. Readers uninterested in the history of allocator research may skip the footnotes. The following section (2.3) is quite important, however, and should not be skipped.)

Allocators are sometimes evaluated using probabilistic analyses. By reasoning about the likelihood of certain events, and the consequences of those events for future events, it may be possible to predict what will happen on average. For the
general problem of dynamic storage allocation, however, the mathematics are too difficult to do this for most algorithms and most workloads. An alternative is to do simulations, and find out "empirically" what really happens when workloads interact with allocator policies. This is more common, because the interactions are so poorly understood that mathematical techniques are difficult to apply.

Unfortunately, in both cases, to make probabilistic techniques feasible, important characteristics of the workload must be known--i.e., the probabilities of relevant characteristics of "input" events to the allocation routine. The relevant characteristics are not understood, and so the probabilities are simply unknown. This is one of the major points of this paper.

The paradigm of statistical mechanics has been used in theories of memory allocation, but we believe that it is the wrong paradigm, at least as it is usually applied. Strong assumptions are made that frequencies of individual events (e.g., allocations and deallocations) are the base statistics from which probabilistic models should be developed, and we think that this is false. The great success of statistical mechanics in other areas is due to the fact that such assumptions make sense there. Gas laws are pretty good idealizations, because the aggregate effects of a very large number of individual events (e.g., collisions between molecules) do concisely express the most important regularities.

This paradigm is inappropriate for memory allocation, for two reasons. The first is simply that the number of objects involved is usually too small for asymptotic analyses to be relevant, but this is not the most important reason. The main weakness of the statistical mechanics approach is that there are important systematic interactions that occur in memory allocation, due to the phase behavior of programs.
No matter how large the system is, basing probabilistic analyses on individual events is likely to yield the wrong answers if there are systematic effects involved which are not captured by the theory. Assuming that the analyses are appropriate for "sufficiently large" systems does not help here--the systematic errors will simply attain greater statistical significance.

Consider the case of evolutionary biology. If an overly simple statistical approach to individual animals' interactions is used, the theory will not capture predator/prey and host/symbiote relationships, sexual selection, or other pervasive evolutionary effects such as niche filling.[14] Developing a highly predictive evolutionary theory is extremely difficult--and some would say impossible--because too many low-level details matter,[15] and there may be intrinsic unpredictabilities in the systems described.[16]

We are not saying that the development of a good theory of memory allocation is as hard as developing a predictive evolutionary theory--far from it. The

[14] Some of these effects may emerge from lower-level modeling, but for simulations to reliably predict them, many important lower-level issues must be modeled correctly, and sufficient data are usually not available, or sufficiently understood.

[15] For example, the different evolutionary strategies implied by the varying replication techniques and mutation rates of RNA-based vs. DNA-based viruses.

[16] For example, a single mutation that results in an adaptive characteristic in one individual may have a major impact on the subsequent evolution of a species and its entire ecosystem.
problem of memory allocation seems far simpler, and we are optimistic that a useful predictive theory can be developed. Our point is simply that the paradigm of simple statistical mechanics must be evaluated relative to other alternatives, which we find more plausible in this domain.

There are major interactions between workloads and allocator policies, which are usually ignored. No matter how large the system, and no matter how asymptotic the analyses, ignoring these effects seems likely to yield major errors--e.g., analyses will simply yield the wrong asymptotes.

A useful probabilistic theory of memory allocation may be possible, but if so, it will be based on a quite different set of statistics from those used so far--statistics which capture the effects of systematicities, rather than assuming such systematicities can be ignored. As in biology, the theory must be tested against reality, and refined to capture systematicities that had previously gone unnoticed.

Random simulations. The traditional technique for evaluating allocators is to construct several traces (recorded sequences of allocation and deallocation requests) thought to resemble "typical" workloads, and use those traces to drive a variety of actual allocators. Since an allocator normally responds only to the request sequence, this can produce very accurate simulations of what the allocator would do if the workload were real--that is, if a real program generated that request sequence.

Typically, however, the request sequences are not real traces of the behavior of actual programs. They are "synthetic" traces that are generated automatically by a small subprogram; the subprogram is designed to resemble real programs in certain statistical ways. In particular, object size distributions are thought to be important, because they affect the fragmentation of memory into blocks of varying sizes.
Object lifetime distributions are also often thought to be important (but not always), because they affect whether blocks of memory are occupied or free.

Given a set of object size and lifetime distributions, the small "driver" subprogram generates a sequence of requests that obeys those distributions. This driver is simply a loop that repeatedly generates requests, using a pseudo-random number generator; at any point in the simulation, the next data object is chosen by "randomly" picking a size and lifetime, with a bias that (probabilistically) preserves the desired distributions. The driver also maintains a table of objects that have been allocated but not yet freed, ordered by their scheduled death (deallocation) time. (That is, the step at which they were allocated, plus their randomly-chosen lifetime.) At each step of the simulation, the driver deallocates any objects whose death times indicate that they have expired. One convenient measure of simulated "time" is the volume of objects allocated so far--i.e., the sum of the sizes of objects that have been allocated up to that step of the simulation.[17]

[17] In many early simulations, the simulator modeled real time, rather than just discrete steps of allocation and deallocation. Allocation times were chosen based on
An important feature of these simulations is that they tend to reach a "steady state." After running for a certain amount of time, the volume of live (simulated) objects reaches a level that is determined by the size and lifetime distributions, and after that objects are allocated and deallocated in approximately equal numbers. The memory usage tends to vary very little, wandering probabilistically (in a random walk) around this "most likely" level. Measurements are typically made by sampling memory usage at points after the steady state has presumably been reached, or by averaging over a period of "steady-state" variation. These measurements "at equilibrium" are assumed to be important.

There are three common variations of this simulation technique. One is to use a simple mathematical function to determine the size and lifetime distributions, such as uniform or (negative) exponential. Exponential distributions are often used because it has been observed that programs are typically more likely to allocate small objects than large ones,[18] and are more likely to allocate short-lived objects than long-lived ones.[19] (The size distributions are generally truncated at some plausible minimum and maximum object size, and discretized, rounding them to the nearest integer.)

The second variation is to pick distributions intuitively, i.e., out of a hat, but in ways thought to resemble real program behavior. One motivation for this is to model the fact that many programs allocate objects of some sizes in large numbers, and objects of other sizes in small numbers or not at all; we refer to these distributions as "spiky."[20]

The third variation is to use statistics gathered from real programs, to make the distributions more realistic. In almost all cases, size and lifetime distributions are assumed to be independent--the fact that different sizes of objects may have different lifetime distributions is generally assumed to be unimportant.
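The driver loop described above fits in a few lines. The sketch below is our own illustration; the particular distributions (uniform sizes, exponential lifetimes measured in allocation steps) are placeholder assumptions, not taken from any study. Run long enough, it exhibits the steady-state behavior discussed above:

```python
import random

# Sketch of a synthetic-trace driver of the kind described in the text.
# Sizes and lifetimes are drawn independently from fixed distributions;
# the specific choices here are illustrative assumptions only.

random.seed(42)

def synthetic_trace(steps):
    deaths = {}            # death step -> list of sizes (the "death table")
    live_volume = 0
    history = []
    for step in range(steps):
        # deallocate objects whose scheduled death time has arrived
        for size in deaths.pop(step, []):
            live_volume -= size
        # allocate one new object with a random size and lifetime
        size = random.randint(1, 100)
        lifetime = 1 + int(random.expovariate(1 / 50))   # mean ~50 steps
        deaths.setdefault(step + lifetime, []).append(size)
        live_volume += size
        history.append(live_volume)
    return history

history = synthetic_trace(20000)

# Live volume climbs at first, then wanders around a "steady state" level
# determined by the distributions (roughly mean size times mean lifetime).
early = sum(history[:100]) / 100
late = sum(history[-5000:]) / 5000
print(early < late)        # True: still ramping up during the first steps
```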
In general, there has been something of a trend toward the use of more realistic distributions,[21] but this trend is not dominant. Even now, researchers often use simple and smooth mathematical functions to generate traces for allocator evaluation.[22]

The use of smooth distributions is questionable, because it bears directly on issues of fragmentation--if objects of only a few sizes are allocated, the free (and uncoalescable) blocks are likely to be of those sizes, making it possible to find a perfect fit. If the object sizes are smoothly distributed, the requested sizes will almost always be slightly different, increasing the chances of fragmentation.

Probabilistic analyses. Since Knuth's derivation of the "fifty percent rule" [Knu73] (discussed later, in Section 4), there have been many attempts to reason probabilistically about the interactions between program behavior and allocator policy, and to assess the overall cost in terms of fragmentation (usually) and/or CPU time.

These analyses have generally made the same assumptions as random-trace simulation experiments--e.g., random object allocation order, independence of sizes and lifetimes, steady-state behavior--and often stronger assumptions as well. These simplifying assumptions have generally been made in order to make the mathematics tractable. In particular, assumptions of randomness and independence make it possible to apply the well-developed theory of stochastic processes (Markov models, etc.) to derive analytical results about expected behavior.

Unfortunately, these assumptions tend to be false for most real programs, so the results are of limited utility. It should be noted that these are not merely convenient simplifying assumptions that allow solution of problems that closely resemble real problems. If that were the case, one could expect that, with refinement of the analyses--or with sufficient empirical validation that the assumptions don't matter in practice--the results would come close to reality. There is no reason to expect such a happy outcome.

[17, continued] randomly chosen "arrival" times, generated using an "interarrival distribution," and their deaths were scheduled in continuous time, rather than in discrete time based on the number and/or sizes of objects allocated so far. We will generally ignore this distinction in this paper, because we think other issues are more important. As will become clear, in the methodology we favor, this distinction is not important, because the actual sequences of actions are sufficient to guarantee exact simulation, and the actual sequence of events is recorded rather than being (approximately) emulated.

[18] Historically, uniform size distributions were the most common in early experiments; exponential distributions then became increasingly common, as new data became available showing that real systems generally used many more small objects than large ones. Other distributions have also been used, notably Poisson and hyperexponential. Still, relatively recent papers have used uniform size distributions, sometimes as the only distribution.

[19] As with size distributions, there has been a shift over time toward non-uniform lifetime distributions, often exponential. This shift occurred later, probably because real data on size information was easier to obtain, and lifetime data appeared later.

[20] In general, this modeling has not been very precise. Sometimes the sizes chosen out of a hat are allocated in uniform proportions, rather than in skewed proportions reflecting the fact that (on average) programs allocate many more small objects than large ones.
[21] The trend toward more realistic distributions can be explained historically and pragmatically. In the early days of computing, the distributions of interest were usually the distribution of segment sizes in an operating system's workload. Without access to the inside of an operating system, this data was difficult to obtain. (Most researchers would not have been allowed to modify the implementation of the operating system running on a very valuable and heavily-timeshared computer.) Later, the emphasis of study shifted away from segment sizes in segmented operating systems, and toward data object sizes in the virtual memories of individual processes running in paged virtual memories.

[22] We are unclear on why this should be, except that a particular theoretical and experimental paradigm [Kuh70] had simply become thoroughly entrenched by the early 1970's. (It's also somewhat easier than dealing with real data.)

These assumptions dramatically change the key features of the problem; the ability to perform the analyses hinges on the very facts that make them much less relevant to the general problem of memory allocation. Assumptions of randomness and independence make the problem irregular, in a superficial sense, but they make it very smooth (hence mathematically tractable) in a probabilistic sense. This smoothness has the advantage that it makes it possible to derive analytical results, but it has the disadvantage that it turns a real and deep scientific problem into a mathematical puzzle that is much less significant for our purposes.

The problem of dynamic storage allocation is intractable, in the vernacular sense of the word. As an essentially data-dependent problem, we do not have a grip on it, because we simply do not understand the inputs. "Smoothing" the problem to make it mathematically tractable "removes the handles" from something that is fundamentally irregular, making it unlikely that we will get any real purchase or leverage on the important issues. Removing the irregularities removes some of the problems--and most of the opportunities as well.

A note on exponentially-distributed random lifetimes. Exponential lifetime distributions have become quite common in both empirical and analytic studies of memory fragmentation over the last two decades. In the case of empirical work (using random-trace simulations), this seems an admirable adjustment to some observed characteristics of real program behavior. In the case of analytic studies, it turns out to have some very convenient mathematical properties as well. Unfortunately, it appears that the apparently exponential appearance of real lifetime distributions is often an artifact of experimental methodology (as will be explained in Sections 2.3 and 4.1), and that the emphasis on distributions tends to distract researchers from the strongly patterned underlying processes that actually generate them (as will be explained in Section 2.4).

We invite the reader to consider a randomly-ordered trace with an exponential lifetime distribution.
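Such a trace is easy to examine directly. The following small experiment (our own illustration) draws exponential lifetimes and shows the "half-life" decay property: among surviving objects, the mean remaining lifetime is about the same regardless of how long they have already lived.

```python
import random

# Demonstration (ours) of the memoryless property of exponential lifetimes:
# an object's age tells the allocator nothing about how much longer it
# will live, so age is useless as a predictor of death time.

random.seed(1)
mean_lifetime = 100.0
lifetimes = [random.expovariate(1 / mean_lifetime) for _ in range(200_000)]

def mean_remaining(age):
    """Average remaining lifetime among objects that survived past `age`."""
    survivors = [t - age for t in lifetimes if t > age]
    return sum(survivors) / len(survivors)

# All of these come out near 100, independent of age.
for age in (0, 50, 100, 200):
    print(round(mean_remaining(age)))
```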
In this case there is no correlation at all between an object's age and its expected time until death--the "half-life" decay property of the distribution and the randomness ensure that allocated objects die completely at random, with no way to estimate their death times from any of the information available to the allocator.[23] (An exponential random function exhibits only a half-life property, and no other pattern, much like radioactive decay.)

In a sense, exponential lifetimes are thus the reductio ad absurdum of the synthetic trace methodology--all of the time-varying regularities have been systematically eliminated from the input. If we view the allocator's job as an online problem of detecting and exploiting regularities, we see that this puts the allocator in the awkward position of trying to extract helpful hints from pure noise. This does not necessarily mean that all allocators will perform identically under randomized workloads, however, because there are regularities in size distributions, whether they are real distributions or simple mathematical ones, and some allocators may simply shoot themselves in the foot.

Analyses and experiments with exponentially distributed random lifetimes may say something revealing about what happens when an allocator's strategy is completely orthogonal to the actual regularities. We have no real idea whether

[23] We are indebted to Henry Baker, who has made quite similar observations with respect to the use of exponential lifetime distributions to estimate the effectiveness of generational garbage collection schemes [Bak93].
this is a situation that occurs regularly in the space of possible combinations of real workloads and reasonable strategies. (It's clear that it is not the usual case, however.) The terrain of that space is quite mysterious to us.

A note on Markov models. Many probabilistic studies of memory allocation have used first-order Markov processes to approximate program and allocator behavior, and have derived conclusions based on the well-understood properties of Markov models. In a first-order Markov model, the probabilities of state transitions are known and fixed. In the case of fragmentation studies, this corresponds to assuming that a program allocates objects at random, with fixed probabilities of allocating different sizes.

The space of possible states of memory is viewed as a graph, with a node for each configuration. There is a start state, representing an empty memory, and a transition probability for each possible allocation size. For a given placement policy, there will be a known transition from a given state for any possible allocation or deallocation request. The state reached by each possible allocation is another configuration of memory. For any given request distribution, there is a network of possible states reachable from the start state, via successions of more or less probable transitions.

In general, for any memory above a very, very small size, and for arbitrary distributions of sizes and lifetimes, this network is inconceivably large. As described so far, it is therefore useless for any practical analyses. To make the problem more tractable, certain assumptions are often made. One of these is that lifetimes are exponentially distributed as well as random, and have the convenient half-life property described above, i.e., objects die completely at random as well as being born at random. This assumption can be used to ensure that both the states and the transitions between states have definite probabilities in the long run.
That is, if one were to run a random-trace simulation for a long enough period of time, all reachable states would be reached, and all of them would be reached many times--and the number of times they were reached would reflect the probabilities of their being reached again in the future, if the simulation were continued indefinitely. If we put a counter on each of the states to keep track of the number of times each state was reached, the ratio between these counts would eventually stabilize, plus or minus small short-term variations. The relative weights of the counters would "converge" to a stable solution.

Such a network of states is called an ergodic Markov model, and it has very convenient mathematical properties. In some cases, it's possible to avoid running a simulation at all, and analytically derive what the network's probabilities would converge to.

Unfortunately, this is a very inappropriate model for real program and allocator behavior. An ergodic Markov model is a kind of (probabilistic) finite automaton, and as such the patterns it generates are very, very simple, though randomized and hence unpredictable. They're almost unpatterned, in fact, and hence very predictable in a certain probabilistic sense.
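The convergence property is easy to illustrate on a toy chain (our own; the three states are abstract, not actual memory configurations): long-run visit frequencies match the distribution that can be derived analytically, without running the simulation at all.

```python
import random

# Toy ergodic Markov chain (ours; unrelated to any real allocator)
# illustrating convergence: long-run visit frequencies stabilize to a
# stationary distribution that can also be derived analytically.

P = [[0.9, 0.1, 0.0],    # transition probabilities among 3 states
     [0.5, 0.0, 0.5],
     [0.0, 0.3, 0.7]]

# Analytical route: repeatedly apply P to any starting distribution
# (power iteration) until it stops changing.
dist = [1.0, 0.0, 0.0]
for _ in range(1000):
    dist = [sum(dist[i] * P[i][j] for i in range(3)) for j in range(3)]

# Empirical route: run the chain and count state visits.
random.seed(0)
counts = [0, 0, 0]
state = 0
for _ in range(200_000):
    counts[state] += 1
    state = random.choices(range(3), weights=P[state])[0]
freqs = [c / sum(counts) for c in counts]

# The two agree: the chain is ergodic, so visit frequencies converge.
print([round(d, 2) for d in dist])
print([round(f, 2) for f in freqs])
```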
Such an automaton is extremely unlikely to generate many patterns that seem likely to be important in real programs, such as the creation of the objects in a linked list in one order, and their later destruction in exactly the same order, or exactly the reverse order.[24]

There are much more powerful kinds of machines--which have more complex state, like a real program--which are capable of generating more realistic patterns. Unfortunately, the only machines that we are sure generate the "right kinds" of patterns are actual real programs. We do not understand what regularities exist in real programs well enough to model them formally and perform probabilistic analyses that are directly applicable to real program behavior. The models we have are grossly inaccurate in respects that are quite relevant to problems of memory allocation.

There are problems for which Markov models are useful, and a smaller number of problems where assumptions of ergodicity are appropriate. These problems involve processes that are literally random, or can be shown to be effectively random in the necessary ways. The general heap allocation problem is not in either category. (If this is not clear, the next section should make it much clearer.)

Ergodic Markov models are also sometimes used for problems where the basic assumptions are known to be false in some cases--but they should only be used in this way if they can be validated, i.e., shown by extensive testing to produce the right answers most of the time, despite the oversimplifications they're based on. For some problems it "just turns out" that the differences between real systems and the mathematical models are not usually significant.
For the general problem of memory allocation, this turns out to be false as well--recent results clearly invalidate the use of simple Markov models [ZG94, WJNB95].

[24] Technically, a Markov model will eventually generate such patterns, but the probability of generating a particular pattern within a finite period of time is vanishingly small if the pattern is large and not very strongly reflected in the arc weights. That is, many quite probable kinds of patterns are extremely improbable in a simple Markov model.

[25] It might seem that the problem here is the use of first-order Markov models, whose states (nodes in the reachability graph) correspond directly to states of memory, and that perhaps "higher-order" Markov models would work, where nodes in the graph represent sequences of concrete state transitions. However, we do not believe these higher-order models will work any better than first-order models do. The important kinds of patterns produced by real programs are generally not simple very-short-term sequences of a few events, but large-scale patterns involving many events. To capture these, a Markov model would have to be of such high order that analyses would be completely infeasible. It would essentially have to be pre-programmed to generate specific literal sequences of events. This not only begs the essential question of what real programs do, but seems certain not to concisely capture the right regularities. Markov models are simply not powerful enough--i.e., not abstract enough in the right ways--to help with this problem. They should not be used for this purpose, or for any similarly poorly understood purpose where complex patterns may be very important. (At least, not without extensive validation.) The fact that the regularities are complex and unknown is not a good reason to assume that they're effectively random [ZG94, WJNB95] (Section 4.2).
Compact Representations and Approximations for Compuation in Games Kevin Swersky April 23, 2008 Abstract Compact representations have recently been developed as a way of both encoding the strategic interactions
More informationComparing Alternate Designs For A Multi-Domain Cluster Sample
Comparing Alternate Designs For A Multi-Domain Cluster Sample Pedro J. Saavedra, Mareena McKinley Wright and Joseph P. Riley Mareena McKinley Wright, ORC Macro, 11785 Beltsville Dr., Calverton, MD 20705
More informationProcess Intelligence: An Exciting New Frontier for Business Intelligence
February/2014 Process Intelligence: An Exciting New Frontier for Business Intelligence Claudia Imhoff, Ph.D. Sponsored by Altosoft, A Kofax Company Table of Contents Introduction... 1 Use Cases... 2 Business
More informationCHAPTER - 5 CONCLUSIONS / IMP. FINDINGS
CHAPTER - 5 CONCLUSIONS / IMP. FINDINGS In today's scenario data warehouse plays a crucial role in order to perform important operations. Different indexing techniques has been used and analyzed using
More informationWHITE PAPER. Understanding IP Addressing: Everything You Ever Wanted To Know
WHITE PAPER Understanding IP Addressing: Everything You Ever Wanted To Know Understanding IP Addressing: Everything You Ever Wanted To Know CONTENTS Internet Scaling Problems 1 Classful IP Addressing 3
More informationMemory Allocation. Static Allocation. Dynamic Allocation. Memory Management. Dynamic Allocation. Dynamic Storage Allocation
Dynamic Storage Allocation CS 44 Operating Systems Fall 5 Presented By Vibha Prasad Memory Allocation Static Allocation (fixed in size) Sometimes we create data structures that are fixed and don t need
More informationOperating Systems, 6 th ed. Test Bank Chapter 7
True / False Questions: Chapter 7 Memory Management 1. T / F In a multiprogramming system, main memory is divided into multiple sections: one for the operating system (resident monitor, kernel) and one
More informationModule 11. Software Project Planning. Version 2 CSE IIT, Kharagpur
Module 11 Software Project Planning Lesson 27 Project Planning and Project Estimation Techniques Specific Instructional Objectives At the end of this lesson the student would be able to: Identify the job
More informationModule 2. Software Life Cycle Model. Version 2 CSE IIT, Kharagpur
Module 2 Software Life Cycle Model Lesson 4 Prototyping and Spiral Life Cycle Models Specific Instructional Objectives At the end of this lesson the student will be able to: Explain what a prototype is.
More informationMoral Hazard. Itay Goldstein. Wharton School, University of Pennsylvania
Moral Hazard Itay Goldstein Wharton School, University of Pennsylvania 1 Principal-Agent Problem Basic problem in corporate finance: separation of ownership and control: o The owners of the firm are typically
More informationChapter 12 File Management
Operating Systems: Internals and Design Principles, 6/E William Stallings Chapter 12 File Management Dave Bremer Otago Polytechnic, N.Z. 2008, Prentice Hall Roadmap Overview File organisation and Access
More informationVirtual Routing: What s The Goal? And What s Beyond? Peter Christy, NetsEdge Research Group, August 2001
Virtual Routing: What s The Goal? And What s Beyond? Peter Christy, NetsEdge Research Group, August 2001 Virtual routing is a software design method used to provide multiple independent routers that share
More informationChapter 12 File Management. Roadmap
Operating Systems: Internals and Design Principles, 6/E William Stallings Chapter 12 File Management Dave Bremer Otago Polytechnic, N.Z. 2008, Prentice Hall Overview Roadmap File organisation and Access
More informationFile Management. Chapter 12
Chapter 12 File Management File is the basic element of most of the applications, since the input to an application, as well as its output, is usually a file. They also typically outlive the execution
More informationThere are a number of factors that increase the risk of performance problems in complex computer and software systems, such as e-commerce systems.
ASSURING PERFORMANCE IN E-COMMERCE SYSTEMS Dr. John Murphy Abstract Performance Assurance is a methodology that, when applied during the design and development cycle, will greatly increase the chances
More informationPractical Calculation of Expected and Unexpected Losses in Operational Risk by Simulation Methods
Practical Calculation of Expected and Unexpected Losses in Operational Risk by Simulation Methods Enrique Navarrete 1 Abstract: This paper surveys the main difficulties involved with the quantitative measurement
More information54 Robinson 3 THE DIFFICULTIES OF VALIDATION
SIMULATION MODEL VERIFICATION AND VALIDATION: INCREASING THE USERS CONFIDENCE Stewart Robinson Operations and Information Management Group Aston Business School Aston University Birmingham, B4 7ET, UNITED
More information1-04-10 Configuration Management: An Object-Based Method Barbara Dumas
1-04-10 Configuration Management: An Object-Based Method Barbara Dumas Payoff Configuration management (CM) helps an organization maintain an inventory of its software assets. In traditional CM systems,
More informationConcept of Cache in web proxies
Concept of Cache in web proxies Chan Kit Wai and Somasundaram Meiyappan 1. Introduction Caching is an effective performance enhancing technique that has been used in computer systems for decades. However,
More informationRafael Witten Yuze Huang Haithem Turki. Playing Strong Poker. 1. Why Poker?
Rafael Witten Yuze Huang Haithem Turki Playing Strong Poker 1. Why Poker? Chess, checkers and Othello have been conquered by machine learning - chess computers are vastly superior to humans and checkers
More informationAmajor benefit of Monte-Carlo schedule analysis is to
2005 AACE International Transactions RISK.10 The Benefits of Monte- Carlo Schedule Analysis Mr. Jason Verschoor, P.Eng. Amajor benefit of Monte-Carlo schedule analysis is to expose underlying risks to
More informationContributions to Gang Scheduling
CHAPTER 7 Contributions to Gang Scheduling In this Chapter, we present two techniques to improve Gang Scheduling policies by adopting the ideas of this Thesis. The first one, Performance- Driven Gang Scheduling,
More informationUsing simulation to calculate the NPV of a project
Using simulation to calculate the NPV of a project Marius Holtan Onward Inc. 5/31/2002 Monte Carlo simulation is fast becoming the technology of choice for evaluating and analyzing assets, be it pure financial
More informationUniversität Karlsruhe (TH) Forschungsuniversität gegründet 1825. Inheritance Depth as a Cost Factor in Maintenance
Universität Karlsruhe (TH) Forschungsuniversität gegründet 1825 Why is Inheritance Important? A Controlled Experiment on Inheritance Depth as a Cost Factor in Maintenance Walter F. Tichy University of
More informationFile-System Implementation
File-System Implementation 11 CHAPTER In this chapter we discuss various methods for storing information on secondary storage. The basic issues are device directory, free space management, and space allocation
More informationTopics in Computer System Performance and Reliability: Storage Systems!
CSC 2233: Topics in Computer System Performance and Reliability: Storage Systems! Note: some of the slides in today s lecture are borrowed from a course taught by Greg Ganger and Garth Gibson at Carnegie
More informationFactoring & Primality
Factoring & Primality Lecturer: Dimitris Papadopoulos In this lecture we will discuss the problem of integer factorization and primality testing, two problems that have been the focus of a great amount
More informationHow to handle Out-of-Memory issue
How to handle Out-of-Memory issue Overview Memory Usage Architecture Memory accumulation 32-bit application memory limitation Common Issues Encountered Too many cameras recording, or bitrate too high Too
More informationSoftware Engineering Introduction & Background. Complaints. General Problems. Department of Computer Science Kent State University
Software Engineering Introduction & Background Department of Computer Science Kent State University Complaints Software production is often done by amateurs Software development is done by tinkering or
More informationMuse Server Sizing. 18 June 2012. Document Version 0.0.1.9 Muse 2.7.0.0
Muse Server Sizing 18 June 2012 Document Version 0.0.1.9 Muse 2.7.0.0 Notice No part of this publication may be reproduced stored in a retrieval system, or transmitted, in any form or by any means, without
More information1 The Java Virtual Machine
1 The Java Virtual Machine About the Spec Format This document describes the Java virtual machine and the instruction set. In this introduction, each component of the machine is briefly described. This
More information(Refer Slide Time: 01:52)
Software Engineering Prof. N. L. Sarda Computer Science & Engineering Indian Institute of Technology, Bombay Lecture - 2 Introduction to Software Engineering Challenges, Process Models etc (Part 2) This
More informationPUBLIC HEALTH OPTOMETRY ECONOMICS. Kevin D. Frick, PhD
Chapter Overview PUBLIC HEALTH OPTOMETRY ECONOMICS Kevin D. Frick, PhD This chapter on public health optometry economics describes the positive and normative uses of economic science. The terms positive
More informationOperating Systems CSE 410, Spring 2004. File Management. Stephen Wagner Michigan State University
Operating Systems CSE 410, Spring 2004 File Management Stephen Wagner Michigan State University File Management File management system has traditionally been considered part of the operating system. Applications
More informationAdvanced Tutorials. Numeric Data In SAS : Guidelines for Storage and Display Paul Gorrell, Social & Scientific Systems, Inc., Silver Spring, MD
Numeric Data In SAS : Guidelines for Storage and Display Paul Gorrell, Social & Scientific Systems, Inc., Silver Spring, MD ABSTRACT Understanding how SAS stores and displays numeric data is essential
More informationIS YOUR DATA WAREHOUSE SUCCESSFUL? Developing a Data Warehouse Process that responds to the needs of the Enterprise.
IS YOUR DATA WAREHOUSE SUCCESSFUL? Developing a Data Warehouse Process that responds to the needs of the Enterprise. Peter R. Welbrock Smith-Hanley Consulting Group Philadelphia, PA ABSTRACT Developing
More informationAN INTRODUCTION TO PREMIUM TREND
AN INTRODUCTION TO PREMIUM TREND Burt D. Jones * February, 2002 Acknowledgement I would like to acknowledge the valuable assistance of Catherine Taylor, who was instrumental in the development of this
More informationRecommendations for Performance Benchmarking
Recommendations for Performance Benchmarking Shikhar Puri Abstract Performance benchmarking of applications is increasingly becoming essential before deployment. This paper covers recommendations and best
More informationAutomatic Inventory Control: A Neural Network Approach. Nicholas Hall
Automatic Inventory Control: A Neural Network Approach Nicholas Hall ECE 539 12/18/2003 TABLE OF CONTENTS INTRODUCTION...3 CHALLENGES...4 APPROACH...6 EXAMPLES...11 EXPERIMENTS... 13 RESULTS... 15 CONCLUSION...
More informationPROJECT RISK MANAGEMENT
11 PROJECT RISK MANAGEMENT Project Risk Management includes the processes concerned with identifying, analyzing, and responding to project risk. It includes maximizing the results of positive events and
More informationJava's garbage-collected heap
Sponsored by: This story appeared on JavaWorld at http://www.javaworld.com/javaworld/jw-08-1996/jw-08-gc.html Java's garbage-collected heap An introduction to the garbage-collected heap of the Java
More informationWHITE PAPER. Dedupe-Centric Storage. Hugo Patterson, Chief Architect, Data Domain. Storage. Deduplication. September 2007
WHITE PAPER Dedupe-Centric Storage Hugo Patterson, Chief Architect, Data Domain Deduplication Storage September 2007 w w w. d a t a d o m a i n. c o m - 2 0 0 7 1 DATA DOMAIN I Contents INTRODUCTION................................
More informationOPTIMUS SBR. Optimizing Results with Business Intelligence Governance CHOICE TOOLS. PRECISION AIM. BOLD ATTITUDE.
OPTIMUS SBR CHOICE TOOLS. PRECISION AIM. BOLD ATTITUDE. Optimizing Results with Business Intelligence Governance This paper investigates the importance of establishing a robust Business Intelligence (BI)
More informationHow to Write a Successful PhD Dissertation Proposal
How to Write a Successful PhD Dissertation Proposal Before considering the "how", we should probably spend a few minutes on the "why." The obvious things certainly apply; i.e.: 1. to develop a roadmap
More informationMeasuring the Performance of an Agent
25 Measuring the Performance of an Agent The rational agent that we are aiming at should be successful in the task it is performing To assess the success we need to have a performance measure What is rational
More information8. KNOWLEDGE BASED SYSTEMS IN MANUFACTURING SIMULATION
- 1-8. KNOWLEDGE BASED SYSTEMS IN MANUFACTURING SIMULATION 8.1 Introduction 8.1.1 Summary introduction The first part of this section gives a brief overview of some of the different uses of expert systems
More informationMultimedia Caching Strategies for Heterogeneous Application and Server Environments
Multimedia Tools and Applications 4, 279 312 (1997) c 1997 Kluwer Academic Publishers. Manufactured in The Netherlands. Multimedia Caching Strategies for Heterogeneous Application and Server Environments
More informationMemory Allocation Technique for Segregated Free List Based on Genetic Algorithm
Journal of Al-Nahrain University Vol.15 (2), June, 2012, pp.161-168 Science Memory Allocation Technique for Segregated Free List Based on Genetic Algorithm Manal F. Younis Computer Department, College
More informationChapter 24 - Quality Management. Lecture 1. Chapter 24 Quality management
Chapter 24 - Quality Management Lecture 1 1 Topics covered Software quality Software standards Reviews and inspections Software measurement and metrics 2 Software quality management Concerned with ensuring
More informationCONTENT STORE SURVIVAL GUIDE
REVISED EDITION CONTENT STORE SURVIVAL GUIDE THE COMPLETE MANUAL TO SURVIVE AND MANAGE THE IBM COGNOS CONTENT STORE CONTENT STORE SURVIVAL GUIDE 2 of 24 Table of Contents EXECUTIVE SUMMARY THE ROLE OF
More informationThe Mathematics of Alcoholics Anonymous
The Mathematics of Alcoholics Anonymous "As a celebrated American statesman put it, 'Let's look at the record. Bill Wilson, Alcoholics Anonymous, page 50, A.A.W.S. Inc., 2001. Part 2: A.A. membership surveys
More informationBENCHMARKING PERFORMANCE AND EFFICIENCY OF YOUR BILLING PROCESS WHERE TO BEGIN
BENCHMARKING PERFORMANCE AND EFFICIENCY OF YOUR BILLING PROCESS WHERE TO BEGIN There have been few if any meaningful benchmark analyses available for revenue cycle management performance. Today that has
More informationACH 1.1 : A Tool for Analyzing Competing Hypotheses Technical Description for Version 1.1
ACH 1.1 : A Tool for Analyzing Competing Hypotheses Technical Description for Version 1.1 By PARC AI 3 Team with Richards Heuer Lance Good, Jeff Shrager, Mark Stefik, Peter Pirolli, & Stuart Card ACH 1.1
More informationUnderstanding Linux on z/vm Steal Time
Understanding Linux on z/vm Steal Time June 2014 Rob van der Heij rvdheij@velocitysoftware.com Summary Ever since Linux distributions started to report steal time in various tools, it has been causing
More informationThe Phases of an Object-Oriented Application
The Phases of an Object-Oriented Application Reprinted from the Feb 1992 issue of The Smalltalk Report Vol. 1, No. 5 By: Rebecca J. Wirfs-Brock There is never enough time to get it absolutely, perfectly
More informationSimulating the Structural Evolution of Software
Simulating the Structural Evolution of Software Benjamin Stopford 1, Steve Counsell 2 1 School of Computer Science and Information Systems, Birkbeck, University of London 2 School of Information Systems,
More informationThe data centre in 2020
INSIDE TRACK Analyst commentary with a real-world edge The data centre in 2020 Dream the impossible dream! By Tony Lock, January 2013 Originally published on http://www.theregister.co.uk/ There has never
More informationCompetitive Analysis of QoS Networks
Competitive Analysis of QoS Networks What is QoS? The art of performance analysis What is competitive analysis? Example: Scheduling with deadlines Example: Smoothing real-time streams Example: Overflow
More informationAbstraction in Computer Science & Software Engineering: A Pedagogical Perspective
Orit Hazzan's Column Abstraction in Computer Science & Software Engineering: A Pedagogical Perspective This column is coauthored with Jeff Kramer, Department of Computing, Imperial College, London ABSTRACT
More informationOptimal Load Balancing in a Beowulf Cluster. Daniel Alan Adams. A Thesis. Submitted to the Faculty WORCESTER POLYTECHNIC INSTITUTE
Optimal Load Balancing in a Beowulf Cluster by Daniel Alan Adams A Thesis Submitted to the Faculty of WORCESTER POLYTECHNIC INSTITUTE in partial fulfillment of the requirements for the Degree of Master
More informationTaking the First Steps in. Web Load Testing. Telerik
Taking the First Steps in Web Load Testing Telerik An Introduction Software load testing is generally understood to consist of exercising an application with multiple users to determine its behavior characteristics.
More informationCredit Card Market Study Interim Report: Annex 4 Switching Analysis
MS14/6.2: Annex 4 Market Study Interim Report: Annex 4 November 2015 This annex describes data analysis we carried out to improve our understanding of switching and shopping around behaviour in the UK
More informationDeployment of express checkout lines at supermarkets
Deployment of express checkout lines at supermarkets Maarten Schimmel Research paper Business Analytics April, 213 Supervisor: René Bekker Faculty of Sciences VU University Amsterdam De Boelelaan 181 181
More informationIdentifying and Managing Project Risk, Second Edition 2008 Tom Kendrick. The PERIL Database
The PERIL Database Good project management is based on experience. Fortunately, the experience and pain need not all be personal; you can also learn from the experience of others, avoiding the aggravation
More informationInterpreters and virtual machines. Interpreters. Interpreters. Why interpreters? Tree-based interpreters. Text-based interpreters
Interpreters and virtual machines Michel Schinz 2007 03 23 Interpreters Interpreters Why interpreters? An interpreter is a program that executes another program, represented as some kind of data-structure.
More informationæ A collection of interrelated and persistent data èusually referred to as the database èdbèè.
CMPT-354-Han-95.3 Lecture Notes September 10, 1995 Chapter 1 Introduction 1.0 Database Management Systems 1. A database management system èdbmsè, or simply a database system èdbsè, consists of æ A collection
More informationWhitepaper: performance of SqlBulkCopy
We SOLVE COMPLEX PROBLEMS of DATA MODELING and DEVELOP TOOLS and solutions to let business perform best through data analysis Whitepaper: performance of SqlBulkCopy This whitepaper provides an analysis
More informationBroadband Networks. Prof. Dr. Abhay Karandikar. Electrical Engineering Department. Indian Institute of Technology, Bombay. Lecture - 29.
Broadband Networks Prof. Dr. Abhay Karandikar Electrical Engineering Department Indian Institute of Technology, Bombay Lecture - 29 Voice over IP So, today we will discuss about voice over IP and internet
More informationManaging Capacity Using VMware vcenter CapacityIQ TECHNICAL WHITE PAPER
Managing Capacity Using VMware vcenter CapacityIQ TECHNICAL WHITE PAPER Table of Contents Capacity Management Overview.... 3 CapacityIQ Information Collection.... 3 CapacityIQ Performance Metrics.... 4
More informationTexas Success Initiative (TSI) Assessment. Interpreting Your Score
Texas Success Initiative (TSI) Assessment Interpreting Your Score 1 Congratulations on taking the TSI Assessment! The TSI Assessment measures your strengths and weaknesses in mathematics and statistics,
More informationHow to Plan a Successful Load Testing Programme for today s websites
How to Plan a Successful Load Testing Programme for today s websites This guide introduces best practise for load testing to overcome the complexities of today s rich, dynamic websites. It includes 10
More informationOn Benchmarking Popular File Systems
On Benchmarking Popular File Systems Matti Vanninen James Z. Wang Department of Computer Science Clemson University, Clemson, SC 2963 Emails: {mvannin, jzwang}@cs.clemson.edu Abstract In recent years,
More informationIan Stewart on Minesweeper
Ian Stewart on Minesweeper It's not often you can win a million dollars by analysing a computer game, but by a curious conjunction of fate, there's a chance that you might. However, you'll only pick up
More informationHow To Make A Backup System More Efficient
Identifying the Hidden Risk of Data De-duplication: How the HYDRAstor Solution Proactively Solves the Problem October, 2006 Introduction Data de-duplication has recently gained significant industry attention,
More informationChapter 6: The Information Function 129. CHAPTER 7 Test Calibration
Chapter 6: The Information Function 129 CHAPTER 7 Test Calibration 130 Chapter 7: Test Calibration CHAPTER 7 Test Calibration For didactic purposes, all of the preceding chapters have assumed that the
More informationReport to the 79 th Legislature. Use of Credit Information by Insurers in Texas
Report to the 79 th Legislature Use of Credit Information by Insurers in Texas Texas Department of Insurance December 30, 2004 TABLE OF CONTENTS Executive Summary Page 3 Discussion Introduction Page 6
More informationAPPENDIX 1 USER LEVEL IMPLEMENTATION OF PPATPAN IN LINUX SYSTEM
152 APPENDIX 1 USER LEVEL IMPLEMENTATION OF PPATPAN IN LINUX SYSTEM A1.1 INTRODUCTION PPATPAN is implemented in a test bed with five Linux system arranged in a multihop topology. The system is implemented
More information