CODE FACTORING IN GCC ON DIFFERENT INTERMEDIATE LANGUAGES
Annales Univ. Sci. Budapest., Sect. Comp. 30 (2009)

Cs. Nagy, G. Lóki, Á. Beszédes and T. Gyimóthy (Szeged, Hungary)

Abstract. Today, as handheld devices (smart phones, PDAs, etc.) are becoming increasingly popular, storage capacity is becoming more and more important. One way to make better use of the available capacity is to reduce the size of the static executables stored on the device. As a result, code-size optimization is receiving growing attention, and new techniques such as code factoring, which is still under research, are being investigated. Although GNU GCC is the most common compiler in the open source community and already implements a number of code-size optimization algorithms, which can be turned on using the -Os flag, it is still weak in this area. In this article we give an overview of the implementation of different code factoring algorithms (local factoring, sequence abstraction, interprocedural abstraction) on the IPA, Tree, Tree-SSA and RTL passes of GCC. The correctness of the implementation was checked, and the results were measured on different architectures with CSiBE, GCC's official Code-Size Benchmark Environment, as a real-world test suite. These results show that on the ARM architecture we could achieve 61.53% maximum and 2.58% average extra code-size saving compared to the -Os flag of GCC.

1. Introduction

GCC (GNU Compiler Collection) [1] is a compiler with a set of front ends and back ends for different languages and architectures. It is part of the GNU project and it is free software distributed by the Free Software Foundation (FSF) under
the GNU General Public License (GNU GPL) and the GNU Lesser General Public License (GNU LGPL). As the official compiler of Linux, the BSDs, Mac OS X, Symbian OS and many other operating systems, GCC is the most common and most popular compilation tool used by developers. It supports many architectures and it is widely used for mobile phones and other handheld devices, too.

When compiling software for devices like mobile phones, pocket PCs and routers, where storage capacity is limited, a very important feature of the compiler is the ability to produce the smallest possible binary code. GCC already contains code-size reducing algorithms, but since in special cases the amount of saved space can be very important, further optimization techniques may be very useful and may affect our everyday life. One new technique is code factoring, which is still under research. Developers have recognized the power of these methods, and nowadays several applications use these algorithms for optimization purposes. One of these real-life applications is The Squeeze Project maintained by Saumya Debray [2], which was one of the first projects using this technique. Another application is aipop (Automatic Code Compaction software), a commercial program released by AbsInt Angewandte Informatik GmbH featuring "functional abstraction (reverse inlining) for common basic blocks" [3]. This optimizer suite supports the C16x/ST10, HC08 and ARM architectures and is used by SIEMENS as well.

In this paper, we give an overview of the code factoring algorithms on the different intermediate representation languages (ILs) of GCC. The main idea of this approach was introduced at the GCC Developers' Summit in 2004 [4], and since these techniques represent a new generation of code-size optimizations, we expected that an implementation on higher level ILs should lead to better results. We tested our implementation and measured the new results on CSiBE [5], the official Code-Size Benchmark Environment of GCC, which contains about 18 projects with roughly 51 MB of source code.

The paper is organized as follows. Section 2 gives an overview of code factoring algorithms and the optimization structure of GCC. Sections 3 and 4 introduce local code motion and procedural abstraction, with subsections for the corresponding ILs. Finally, Section 5 contains the results we reached with our implementation, and Section 6 presents our conclusions.

2. Overview

Code factoring is a class of useful optimization techniques developed especially for code size reduction. These approaches aim to reduce size by restructuring the
code. One possible factoring method is to make small changes to local parts of the code; this is called local factoring. Another possible way is to abstract parts of the code and extract them into new blocks or functions; this technique is called sequence abstraction or functional abstraction. Both approaches can work on different representations and can be specialized for those languages.

GCC offers several intermediate languages for optimization methods (Figure 1). Currently, most of the optimizing transformations operate on the Register Transfer Language (RTL) [6], which is a very low level language where the instructions are described one by one. It is so close to assembly that the final assembler output is generated from this level. Before the source code is transformed to this level, it passes through higher representation levels, too. First it is transformed to GENERIC form, which is a language-independent abstract syntax tree. This tree is then lowered to GIMPLE, a simplified subset of GENERIC: a restricted form where expressions are broken down into 3-address form (a source-level sketch of this lowering is given at the end of this section).

Figure 1. Optimization passes in GCC.

Since many optimizing transformations require higher level information about the source code that is difficult, or in some cases even impossible, to obtain from RTL, GCC developers introduced a new representation level called Tree-SSA [7, 8], based on the Static Single Assignment form developed by researchers at IBM in the 1980s [9]. This IL is especially suitable for optimization methods that work on the Control Flow Graph (CFG), which is a graph representation of the program containing all paths that might be traversed during execution. The nodes of this graph are basic blocks, where one block is a straight-line sequence of code with only one entry point and only one exit. The nodes of the flow graph are connected via directed edges, and these edges represent jumps in the control flow.

It may be reasonable to run one algorithm on different representations together (e.g. first on the SSA level and later on RTL). Due to the different information stored by the various ILs, it is possible that after running the algorithm on a higher level it will still find optimizable cases on a lower level as well. For the same reason, we obtained our best results by combining the mentioned factoring algorithms on all implemented optimization levels.
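To illustrate the GIMPLE lowering mentioned above, the following sketch shows a C statement rewritten by hand into 3-address form. This is only an illustration of the idea, not an actual GCC dump; the temporary name T1 is made up for the example.

    /* Original C statement: one assignment with two operators. */
    int original(int b, int c, int d)
    {
        int a = b + c * d;
        return a;
    }

    /* The same computation in GIMPLE-style 3-address form: each
       statement contains at most one operator, and the intermediate
       result is held in a compiler-generated temporary (here T1). */
    int lowered(int b, int c, int d)
    {
        int T1, a;
        T1 = c * d;
        a = b + T1;
        return a;
    }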
Figure 2. Implemented algorithms introduced in this paper.

We implemented the presented code factoring algorithms on the Tree-SSA and RTL levels, and for sequence abstraction we also made an interprocedural version. Due to the experimental state of interprocedural analysis in GCC, this part of our implementation is also still under research. Figure 2 shows the order of our new passes.

3. Sinking-hoisting

The main idea of local factoring (also called local code motion, code hoisting or code sinking) is quite simple. Since it often happens that basic blocks with common predecessors or successors contain the same statements, it may be possible to move these statements to the parent or child blocks. For instance, if the first statements of an if node's then and else branches are identical, we can simply move them before the if node. This transformation, called code hoisting, avoids unnecessary duplication of code in the CFG. The idea extends to more complicated cases as well: not only an if node, but also a switch, or code with arbitrary goto statements, can contain identical instructions. Furthermore, it is possible to move statements from the then or else block to after the if node, too. This is called code sinking, and it is only possible when no other statements in the same block depend on the moved ones. A source-level illustration of both directions is given below.

To perform this code motion, we must collect basic blocks with common predecessors or common successors (also called same parents or same children) (Figure 3). These basic blocks form a sibling set, and we look for common statements inside them. If a statement appears in all the blocks of the sibling set and does not depend on any previous statement in its block, it is a hoistable statement; if no later statement in its block depends on it, it is a sinkable statement. When the number of blocks inside the sibling set is bigger than the number of parents (children), it is worth hoisting (sinking) the statement (Figure 4).
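As a source-level illustration (a hypothetical C fragment, not an example from the paper), hoisting moves a statement that begins both branches in front of the if, while sinking moves a statement that ends both branches after the join point:

    /* Before local factoring: both branches begin with the same
       addition and end with the same store. */
    int before(int x, int a, int b, int c, int *out)
    {
        int t, y;
        if (x > 100) {
            t = a + b;      /* common first statement -> hoistable */
            y = t * 2;
            *out = y;       /* common last statement  -> sinkable  */
        } else {
            t = a + b;
            y = t + c;
            *out = y;
        }
        return y;
    }

    /* After local factoring: each common statement appears only once. */
    int after(int x, int a, int b, int c, int *out)
    {
        int t, y;
        t = a + b;          /* hoisted before the if      */
        if (x > 100)
            y = t * 2;
        else
            y = t + c;
        *out = y;           /* sunk after the join point  */
        return y;
    }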
Figure 3. Basic blocks with multiple common predecessors (a) before and (b) after local factoring.

Figure 4. Basic blocks with multiple common successors (a) before and (b) after local factoring.

Figure 5. Basic blocks with multiple common successors but only partially common instructions (a) before and (b) after local factoring.

To obtain a better size reduction, we can also handle a special case of local factoring in which the identical statements do not appear in all basic blocks of the sibling set. When the number of blocks containing these statements is large enough and the statements could be sunk or hoisted, it is still possible to simplify the CFG (Figure 5). For instance, we can create a new block and link it before or after the sibling set, depending on the direction of the movement. By building the correct edges for this new basic block, we can rerun the algorithm on the new sibling set and move the identical statements into the new basic block. The gain may be somewhat smaller this way, because building a new basic block incurs some extra cost for the new statements; however, it makes the algorithm more effective at reducing code size. A source-level sketch of this case follows below.
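The following hypothetical C fragment (not taken from the paper) mimics this special case at the source level: only two of the three branches end with the common statement, so a single shared copy is placed in a new block that both branches jump to. f, g and h stand for arbitrary computations.

    int f(int), g(int), h(int);

    /* Before: the statement y = a + b appears in two of the three cases. */
    int before(int k, int a, int b)
    {
        int x, y = 0;
        switch (k) {
        case 0:  x = f(a); y = a + b; break;
        case 1:  x = g(a); y = a + b; break;
        default: x = h(a);            break;
        }
        return x + y;
    }

    /* After: the two occurrences share one new "block" holding a single copy. */
    int after(int k, int a, int b)
    {
        int x, y = 0;
        switch (k) {
        case 0:  x = f(a); goto shared;
        case 1:  x = g(a); goto shared;
        default: x = h(a); goto done;
        }
    shared:
        y = a + b;
    done:
        return x + y;
    }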
3.1. RTL code motion

We first implemented this algorithm on the RTL level of GCC's optimization phases. This internal representation is a low level representation of the source code, very close to assembly. Therefore, when thinking of movable statements, we should think of assembly-like instructions working with registers, memory addresses, etc. RTL expressions (RTX, for short) are small instructions classified by expression codes (RTX codes). Each RTX has an expression code and a number of arguments. The number of these small instructions is quite large compared to other representations, so the number of potentially movable instructions is the largest on this level. Although this also affects compilation time, the effect is not significant, because instructions have to be compared to each other only in the local parts of the CFG where code factoring is performed. The asymptotic complexity of this algorithm is O(n²), where n is the number of instructions; the reason is that all instructions in the basic blocks of a sibling set are compared to each other one by one in order to find identical statements.

One important aspect of the hoisting-sinking algorithm on this representation level is the definition of identical statements. Equal instructions must obviously have the same RTX code, but because of the low level of RTL, their arguments must be exactly equal, too. To decide whether a statement is movable, we also have to check its dependencies: for hoisting, we check the statements before the current instruction, and for sinking, we do the same check on the instructions after the movable statement until the end of the current basic block.

With this implementation, we could achieve a 4.31% code size reduction (compared to -Os) on a file in the CSiBE environment. Compiling unrarlib for the i686-elf target, we could reach a 0.24% average code-size saving. The average result measured on the whole of CSiBE was about 0.19% for the same architecture.

3.2. Tree-SSA code motion

Tree-SSA [7, 8] is a higher level representation than RTL. It is similar to GENERIC (a language-independent code representation), but it is modified so that every time a variable is assigned in the code, a new version of the variable is created. Every variable has an SSA_NAME, which contains the original name of the variable, and an SSA_VERSION, which stores its version number. Usually, at a control flow join it is not possible at compile time (only at run time) to decide which earlier version of a used variable will be taken. To handle these situations in SSA form, phi nodes [9] are created to help us follow the lifecycle of the variable.
A phi node is a special kind of assignment whose operands indicate which assignments to the given variable reach the current join point of the CFG. For instance, consider the uses of variable a (from the example in Figure 6a) with version numbers 9 and 8 in the then and else branches of an if node. Where a is referenced after the join point of the if, the corresponding version number is ambiguous, and to resolve this ambiguity a phi node is created as a new artificial definition of a with version number 1.

The SSA form is not suited for handling non-scalar variable types like structures, unions, arrays and pointers. For instance, for an M[100][100] array it would be nearly impossible to keep track of different version numbers or to decide whether M[i][j] and M[l][k] refer to the same variable or not. To handle these problematic cases, the compiler stores references to the base object of non-scalar variables in virtual operands [7]. For instance, M[i][j] and M[l][k] are both considered references to M in virtual operands.

For this optimization level we should redefine what identical statements are, because we should not use the strict definition used before, which required each argument of two equal statements to be equal as well. Under the strict definition, a_x1 = b_y1 + c_z1 differs from a_x2 = c_z2 + b_y2. However, these assignments are identical for our purposes, because the + operator is commutative and the SSA versions are irrelevant in our case. On the SSA level, we define two statements to be equal when the TREE_CODEs of the statements are equal and, if their arguments are variables, their SSA_NAME operands are equal as well (the version numbers may differ). We require non-variable arguments to be exactly the same, but if the statements are commutative, we also check the arguments in different orders.

For dependency checking, Tree-SSA stores the immediate uses of every statement in a list that we can walk through using an iterator macro. Thanks to this representation, the dependency check is the same as in the RTL phase. When moving statements from one basic block to another, we have to pay careful attention to the phi nodes, the virtual operands and the immediate uses. Usually, the left-hand side variable of a sinkable assignment appears inside a phi node (see the example in Figure 6). In these situations, after copying the statement to the child or parent blocks, we must recalculate the phi nodes. After this, in both the sinking and the hoisting case, we must walk over the immediate uses of the moved statements and rename the old definitions of the variables to the new ones.

The current implementation has a weakness in moving statements involving temporary variables. This problem occurs when an assignment with more than one operation is transformed to SSA form. In SSA form, one assignment statement may contain only a single operation on its right-hand side, and when GCC splits a statement with two or more operations, it creates temporary variables whose unique SSA_NAMEs contain a creation id. As we have defined, two variables are equal when their SSA names are equal.
(a) Before code motion:

    int D;

    <bb 0>:
      if (a_2 > 100) goto <L0>; else goto <L1>;

    <L0>:;
      a_9 = b_5 + a_2;
      c_10 = a;
      a_11 = a_9 - c_10;
      goto <bb 3> (<L2>);

    <L1>:;
      a_6 = a_2 + b_5;
      c_7 = a;
      a_8 = a_6 - c_7;

      # a_1 = PHI <a_11(1), a_8(2)>;
    <L2>:;
      D = a_1;
      return D;

(b) After code motion:

    int D;

    <bb 0>:
      a_16 = a_2 + b_5;
      if (a_2 > 100) goto <L0>; else goto <L1>;

    <L0>:;
      c_10 = a;
      goto <bb 3> (<L2>);

    <L1>:;
      c_7 = a;

      # a_14 = PHI <a_16(1), a_16(2)>;
      # c_15 = PHI <c_10(1), c_7(2)>;
    <L2>:;
      a_1 = a_14 - c_15;
      return a_1;

Figure 6. Example code in Tree-SSA form with movable statements, (a) before and (b) after code motion.

Consequently, such statements are not recognized as movable. Since these kinds of expressions are often used by developers, solving this problem in the implementation may yield better size-reduction results.

On Tree-SSA, the best result we could reach with this algorithm was a 10.34% code saving on a file compiled for the ARM architecture. By compiling the unrarlib project of CSiBE for the i686-elf target, we could achieve 0.87% extra code saving compared to -Os, and the average code size reduction measured for the same target was about 0.1%.

4. Sequence abstraction

Sequence abstraction (also known as procedural abstraction), as opposed to local factoring, works with whole single-entry single-exit (SESE) code fragments, not only with single instructions. The technique is based on finding identical regions of code which can be turned into procedures. After creating the new procedure, the identical regions are simply replaced with calls to the newly created subroutine. There are well-known existing solutions [10, 11] already, but these approaches can only deal with code fragments that are either identical, equivalent in some sense, or can be transformed to an equivalent form by register renaming. However, these methods fail to find an optimal solution for cases where an instruction sequence is equivalent to another one, while a third one is only
identical with its suffix (Figure 7a). The current solutions handle this situation in two possible ways. One way is to abstract the longest possible sequence into a function and leave the shorter one unabstracted (Figure 7b). The other way is to turn the instructions common to all sequences into a function and create another new function from the remaining common part of the longer sequences (Figure 8c); in this case, the overhead of the extra inserted call/return code must be dealt with as well.

Our approach is to create multiple-entry single-exit (MESE) functions in the cases described above. This way, we allow the abstraction of instruction sequences of differing lengths. The longest possible sequence is chosen as the body of the new function, and the entry points are defined according to the lengths of the matching sequences. Each matching sequence is then replaced with a call to the appropriate entry point of the new function. Figure 8d shows the optimal solution for the problem depicted in Figure 7a. A source-level sketch of the multiple-entry idea is given at the end of Section 4.2.

Figure 7. Abstraction of (a) instruction sequences of differing lengths to procedures using the strategy of abstracting only the longest sequence (b). Identical letters denote identical sequences.

Sequence abstraction has some performance overhead due to the execution of the inserted call and return code. Moreover, the size overhead of the inserted code must also be taken into account. The abstraction should only be carried out if the gain resulting from the elimination of duplicates exceeds the loss arising from the insertion of extra instructions.

4.1. Sequence abstraction on RTL

Algorithms using the RTL representation can optimize only one function at a time. Although sequence abstraction is inherently an interprocedural optimization technique, it can be adapted to intraprocedural operation. Instead of creating a new function from the identical code fragments, one representative
instance of them has to be retained in the body of the processed function, and all other occurrences are replaced by code transferring control to the retained instance. However, to preserve the semantics of the original program, the point where control has to be returned after the execution of the retained instance must be remembered somehow, thus the subroutine call/return mechanism has to be mimicked. In the current implementation, we use labels to mark the return addresses, registers to store references to them, and jumps on registers to transfer control back to the callers.

Figure 8. Abstraction of the instruction sequences of differing lengths from Figure 7 to procedures, using different strategies (c, d). Identical letters denote identical sequences.

Implementing sequence abstraction in the RTL phase has a significant benefit over implementing it on a higher intermediate language: most of the optimization algorithms in the compilation queue have already finished by the time the abstraction starts. The few passes executed after sequence abstraction have little or no impact on code size, so our algorithm has a directly observable effect on the output generated by GCC.

The current implementation can only deal with identical statements, where the registers and the instructions are exactly the same. As a further improvement, with some extra cost it might be possible to abstract identical sequences where the registers differ. This implementation has an approximately O(n²) running time; the reason is the comparison of the possible sequences inside basic blocks with n instructions of the current IL. This cost could be reduced to O(n log n) using hashtables with fingerprints. We already have a version with this optimization as well, but further tests are required to validate it.

This algorithm brought us a maximum of 45.69% code saving on a source file
compiled in CSiBE. Compiling the libmspack project of CSiBE for the arm-elf target, we achieved 2.39% extra code saving compared to -Os, and our average size reduction measured for the same target was about 1.01%.

4.2. Sequence abstraction on Tree

As a general rule in compilers, one statement of a higher intermediate language may describe several architecture-dependent instructions of a lower IL. In our view, if the sequence abstraction algorithm can merge similar sequences in a higher IL, it could lead to a better code size reduction. In the Tree IL there are fewer restrictions than in RTL; for instance, we do not have to care about register assignments. Therefore the algorithm is able to find more sequences as good candidates for abstraction, while in RTL we must be sure that all references to registers are the same in every subsequence.

Unfortunately, the results did not meet our expectations. The main problem is that the sequence abstraction algorithm runs at a very early stage of the compilation pass queue, and passes that follow the abstraction can simply undo its results. This is supported by the fact that, measured right after our pass, we could achieve as much as 9.25% (2.5% on average) code size reduction counted in Tree units. In addition, there are cases where the abstraction performs a merge that is not really desirable, because one or more of the sequences are later dropped (for example as dead code), or because there is a better optimization opportunity; these cases mostly occur when the algorithm tries to merge short sequences.

However, several improvements to the current implementation are still possible. One of them is to extend it with the ability to abstract approximately equal sequences as well: the current implementation recognizes abstractable sequences whose statements are equal, but in several cases it might be possible to deal with sequences that are not exactly the same. Another possible improvement is to compare temporary variables as well; this is exactly the same problem as the one described before for code motion on the Tree-SSA level. This implementation, similarly to the RTL one, has a running time of about O(n²) for the same reason. We also have an optimized O(n log n) version using hashtables with fingerprints, but further tests are required for its validation.

With this optimization method we could reach a maximum of 41.60% code-size saving on a source file of CSiBE, and by compiling the flex project of the environment for the arm-elf target, we achieved 3.33% extra code saving compared to -Os. The average size reduction measured for the same target was about 0.73%.
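To make the multiple-entry idea of this section (Figure 8d) concrete, the following hypothetical C fragment (not taken from the paper) shows two call sites whose common code differs only in length. Standard C has no multiple-entry functions, so the second entry point is emulated here with an entry parameter; at the RTL level the pass instead transfers control directly to a label inside the retained body. A, B and C stand for arbitrary statement sequences.

    void A(void); void B(void); void C(void);

    /* Multiple-entry single-exit (MESE) abstraction of "A; B; C" and "B; C":
       entry 0 executes the whole body, entry 1 skips the leading part. */
    static void abstracted(int entry)
    {
        if (entry == 0)
            A();        /* only the longer match starts here */
        B();            /* the shorter match enters here     */
        C();
    }

    void caller1(void) { abstracted(0); }   /* replaces A; B; C */
    void caller2(void) { abstracted(1); }   /* replaces    B; C */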
4.3. Procedural abstraction with IPA

The main idea of interprocedural analysis (IPA) optimizations is to produce algorithms that work on the entire program, across procedure or even file boundaries. For a long time GCC had no powerful interprocedural methods, because its structure was optimized for compiling functions as separate units. Recently, a new IPA framework and new passes have been introduced on the IPA branch, which is still under heavy development [12].

We have implemented an interprocedural version of our algorithm as well. This implementation is very similar to the one used on Tree, with one big difference: we can merge sequences from any functions into a new, real function. With this approach we have to cope with more code-size overhead coming from the function call ABI than in the other two cases described above. On the other hand, we are also able to merge sequences that use different variables or addresses, because the variables can be passed as parameters to the newly created function. The procedural abstraction identifies and compares the structure of the sequences to find good candidates for abstraction, and with this method we are able to merge sequences more effectively than in the previously discussed implementations.

In spite of these advantages, there are disadvantages as well. The IPA passes also run at a very early stage of the compilation queue, so there is a risk that other passes could optimize the candidates better than procedural abstraction does (the same cases as for sequence abstraction on Tree). The other disadvantage is that a function call requires many instructions and estimating its cost is difficult. This means that we can only guess the gain achievable by a given abstraction, because it is not possible to determine how many assembly instructions a higher level statement will produce. This cost computation problem is a difficulty on the Tree level, too. The estimate is not as accurate as the gain calculation on the RTL level, and in several cases where code size could be saved, this may cause the algorithm to perform no abstraction at all.

Our implementation of this algorithm already has a running time of about O(n log n), as the comparison of the possible sequences is realized using hashtables (a simplified sketch of this idea follows below); it can be compared to the slower O(n²) implementations for the other representations in Table 3. With IPA, our best result on a source file of CSiBE was a 59.29% code saving compared to -Os. By compiling the zlib project of the environment for the arm-elf target, we achieved 4.29% average code saving, and the measured average result on the whole of CSiBE was about 1.03%.
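A minimal sketch of the fingerprint idea (with hypothetical types and helper names, not the actual cfo-branch code): each candidate sequence is reduced to a small hash, the candidates are sorted by that hash, and the expensive statement-by-statement comparison is performed only between candidates that share a fingerprint.

    #include <stdint.h>
    #include <stdlib.h>

    /* One candidate sequence, reduced to an opaque fingerprint (e.g. a
       hash of its statement codes; how it is computed is assumed here). */
    typedef struct { uint64_t fingerprint; int first_stmt, length; } seq;

    static int by_fingerprint(const void *a, const void *b)
    {
        uint64_t x = ((const seq *)a)->fingerprint;
        uint64_t y = ((const seq *)b)->fingerprint;
        return (x > y) - (x < y);
    }

    /* Assumed helpers: exact statement-by-statement comparison and the
       action taken when an abstractable pair is found. */
    int sequences_identical(const seq *a, const seq *b);
    void record_pair(size_t i, size_t j);

    /* Sorting costs O(n log n); exact comparisons are done only within
       groups of equal fingerprints, which are expected to be small. */
    void find_abstractable_pairs(seq *cand, size_t n)
    {
        qsort(cand, n, sizeof *cand, by_fingerprint);
        for (size_t i = 0; i + 1 < n; i++)
            for (size_t j = i + 1;
                 j < n && cand[j].fingerprint == cand[i].fingerprint; j++)
                if (sequences_identical(&cand[i], &cand[j]))
                    record_pair(i, j);
    }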
5. Experimental evaluation

For the implementation, we used the GCC sources from the repository of the cfo-branch [13], which is a snapshot of the GCC development sources; our algorithms are publicly available in the same repository as well. We used CSiBE v2.1.1 [14] as our testbed, an environment developed especially for code-size optimization purposes in GCC. It contains small and commonly used projects like zlib, bzip, parts of the Linux kernel, parts of compilers, graphics libraries, etc. CSiBE is a very good testbed for compilers, not only for measuring code size results but also for validating compilation. Our testing system was a dual Xeon 3.0 GHz PC with 3 Gbyte of memory running the Debian GNU/Linux v3.1 operating system. To run tests on different architectures, we cross-compiled GCC for elf binary targets of common architectures like ARM and SH; for cross-compiling we used binutils v2.17 and newlib.

The results (Table 1 and Table 2) show that these algorithms are effective methods of code size optimization, but on the higher level intermediate representation languages further improvements may still be required to get better performance, due to the problems already mentioned. For the i686-elf target, by running all the implemented algorithms as an extension to the -Os flag, we could achieve a maximum of 57.05% and an average of 2.13% code-size saving. The code-size saving percentage is calculated by dividing the size of the object (compiled with the given flags) by the size of the same object compiled with the -Os flag; this value is subtracted from 1 and converted to a percentage (by multiplying it by 100).

flags                                  i686-elf          arm-elf           sh-elf
                                       size    saving    size    saving    size    saving
                                       (byte)  (%)       (byte)  (%)       (byte)  (%)
-Os
-Os -ftree-lfact -frtl-lfact
-Os -frtl-lfact
-Os -ftree-lfact
-Os -ftree-seqabstr -frtl-seqabstr
-Os -frtl-seqabstr
-Os -ftree-seqabstr
-Os -fipa-procabstr
all

Table 1. Average code-size saving results (size is in bytes; saving is the size saving relative to -Os, in %).

Here we have to mention that by compiling CSiBE with only the -Os flag, using the same version of GCC we used for the implementation, we get optimized code whose size is 37.19% smaller than when compiling without any optimization.
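For example (with made-up numbers, purely to illustrate the formula above): if an object compiled with -Os alone is 4000 bytes and the same object compiled with the extra flags is 3900 bytes, the saving is (1 - 3900/4000) * 100 = 2.5%.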
flags                                  i686-elf           arm-elf            sh-elf
                                       max. saving (%)    max. saving (%)    max. saving (%)
-Os -ftree-lfact -frtl-lfact
-Os -frtl-lfact
-Os -ftree-lfact
-Os -ftree-seqabstr -frtl-seqabstr
-Os -frtl-seqabstr
-Os -ftree-seqabstr
-Os -fipa-procabstr
all

Table 2. Maximum code-size saving results for CSiBE objects (saving is the size saving relative to -Os, in %).

By running all the implemented algorithms together, we get a smaller code saving percentage than the sum of the percentages of the individual algorithms. This difference occurs because the algorithms work on the same source tree, and earlier passes may optimize the same cases that would otherwise be handled by later methods. This is why, for the i686-elf target, running local factoring on the RTL level and on Tree-SSA yields 0.19% and 0.10% average code saving on CSiBE respectively, while running both algorithms together yields only 0.27%. This difference also shows that the same optimization method on different ILs may catch different optimizable cases, so running the same algorithm on more than one IL leads to better results. The explanation for the negative values in the code-saving percentage columns (Figure 9) is exactly the same: unfortunately, even if an algorithm optimizes the source tree, it may spoil the input of another pass that could have optimized the same tree better. These differences do not mean that the corresponding algorithms are not effective; they usually mean that later passes could have optimized the same input better.

In the table of compilation times (Table 3), two algorithms stand out from the others: the RTL and Tree sequence abstractions, because their current implementations run in about O(n²) time. Nevertheless, these algorithms can be implemented with O(n log n) running time, which should result in a relative growth of compilation time similar to that of the interprocedural abstraction, which is already implemented with O(n log n) running time.

We have to note that the sequence abstraction algorithms also have an effect on the execution time of the compiled program, because these methods add new calls to the CFG. Since local factoring does not change the number of executed instructions, those optimizations do not change the execution time of the optimized binaries. We measured on our testing system with CSiBE that the average growth of running time compared to the -Os flag was about 0.26% for tree abstraction, 0.18% for interprocedural abstraction and 2.49% for the execution of all the algorithms together.
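For reference, the "all" configuration above presumably corresponds to passing every new option on top of -Os (the option names are those used in Tables 1-3; the cross-compiler name and source file below are placeholders), e.g. arm-elf-gcc -Os -ftree-lfact -frtl-lfact -ftree-seqabstr -frtl-seqabstr -fipa-procabstr -c file.c.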
Figure 9. Detailed results for selected CSiBE projects.

flags                                  i686-elf               arm-elf                sh-elf
                                       absolute   relative    absolute   relative    absolute   relative
-Os
-Os -ftree-lfact -frtl-lfact
-Os -frtl-lfact
-Os -ftree-lfact
-Os -ftree-seqabstr -frtl-seqabstr
-Os -frtl-seqabstr
-Os -ftree-seqabstr
-Os -fipa-procabstr
all

Table 3. Compilation time results (absolute is in seconds (s); relative is the multiplication factor relative to the compilation time with -Os).

6. Conclusion

Based on the results above, the sequence abstraction algorithms produced the best results for us. However, the percentages show that these algorithms did not work as well on the higher level intermediate languages as we expected. The reason is that, running in earlier passes, our methods can easily spoil the code for later passes that could deal better with the same optimization cases. Perhaps these situations could be eliminated by compiling in two passes, where
in the first pass we identify these problematic situations, and in the second one we optimize only the cases that would not hinder later passes.

Regarding local code motion, we have to note that these algorithms yielded smaller percentages but collaborated better with later passes. The reason is that these methods perform transformations on local parts of the source tree and do not make global changes to it. Another benefit is that local factoring has very little, or usually no, overhead, so the accuracy of the cost computation does not affect the optimization much.

Another conclusion concerns the running time of the algorithms. Sequence abstraction brought us the best results, but with a rather slow running time. Since this affects compilation time, we conclude that if the duration of compilation is important, local code motion is recommended as a fast and effective algorithm; otherwise, when compilation time does not matter, all the algorithms can be used together for better results.

Finally, as an acknowledgment, this work was realized thanks to the cfo-branch of GCC. Developers interested in our work are always welcome to help us improve the code factoring algorithms introduced in this paper.

References

[1] Free Software Foundation, GCC, GNU Compiler Collection, accessed March.
[2] Debray S., Evans W., Bosschere K. D. and Sutter B. D., The Squeeze Project: Executable Code Compression, accessed March.
[3] AbsInt (Angewandte Informatik), aipop - Automatic Code Compaction, accessed March.
[4] Lóki G., Kiss A., Jász J. and Beszédes A., Code factoring in GCC, Proceedings of the 2004 GCC Developers Summit, 2004.
[5] Beszédes A., Ferenc R., Gergely T., Gyimóthy T., Lóki G. and Vidács L., CSiBE benchmark: One year perspective and plans, Proceedings of the 2004 GCC Developers Summit, June 2004.
[6] Free Software Foundation, GNU Compiler Collection (GCC) internals, accessed March.
[7] Novillo D., Tree SSA: a new optimization infrastructure for GCC, Proceedings of the 2003 GCC Developers Summit, 2003.
[8] Novillo D., Design and implementation of Tree SSA, Proceedings of the 2004 GCC Developers Summit, 2004.
[9] Cytron R., Ferrante J., Rosen B. K., Wegman M. N. and Zadeck F. K., Efficiently computing static single assignment form and the control dependence graph, ACM Transactions on Programming Languages and Systems, 13 (4) (1991).
[10] Cooper K. D. and McIntosh N., Enhanced code compression for embedded RISC processors, Proc. ACM SIGPLAN Conference on Programming Language Design and Implementation, 1999.
[11] Debray S. K., Evans W., Muth R. and de Sutter B., Compiler techniques for code compaction, ACM Transactions on Programming Languages and Systems, 22 (2) (2000).
[12] Hubička J., The GCC call graph module: a framework for interprocedural optimization, Proceedings of the 2004 GCC Developers Summit, 2004.
[13] Lóki G., cfo-branch - GCC development branch, accessed March.
[14] Department of Software Engineering, University of Szeged, GCC Code-Size Benchmark Environment (CSiBE), accessed March.

Cs. Nagy, G. Lóki, Á. Beszédes and T. Gyimóthy
Department of Software Engineering
University of Szeged
Dugonics tér 13., H-6720 Szeged, Hungary
{ncsaba,loki,beszedes,gyimi}@inf.u-szeged.hu