Recovery of Jump Table Case Statements from Binary Code
Cristina Cifuentes
Mike Van Emmerik

Department of Computer Science and Electrical Engineering
The University of Queensland
Brisbane Qld 4072, Australia

Technical Report 444
December 1998

Abstract

One of the fundamental problems with the analysis of binary (executable) code is that of recognizing, in a machine-independent way, the target addresses of n-conditional branches implemented via a jump table. Without these addresses, the decoding of the machine instructions for a given procedure is incomplete, as is any analysis on that procedure. In this paper we present a technique for recovering jump tables and their target addresses in a machine- and compiler-independent way. The technique is based on slicing and expression substitution. The assembly code of a procedure that contains an indexed jump is transformed into a normal form which allows us to determine where the jump table is located and what information it contains (e.g. offsets from the table or absolute addresses). The presented technique has been tested on SPARC and Pentium code generated by C, C++, Fortran and Pascal compilers. Our tests show that up to 90% more of the code in a text segment can be found by using this technique. The technique was developed as part of our retargetable binary translation framework; however, it is also suitable for other binary-manipulation tools such as binary profilers, instrumentors and decompilers.

This work is sponsored by the Australian Research Council under grant No. A and Sun Microsystems.

1. Introduction

The ever growing reliance on software and the continued development of newer and faster machines have increased the need for machine-code tools that aid in the migration, emulation, debugging, tracing, and profiling of legacy code. Amongst these tools are: binary translators [21, 22], emulators/simulators [11, 3], code instrumentors [4, 19, 23], disassemblers [12, 2], and decompilers [8, 15].
The fundamental problem in decoding machine instructions is that of distinguishing code from data; both are represented in the same way in von Neumann machines. Given an executable program, the entry point to that program is given in the program's header. Information on text and data sections is also given in the header. However, data can also be stored in text areas, without such information being stored in the executable program's header. Hence the need to analyze the code that is decoded from the text area(s) of the program, and separate data from code. The standard method of decoding machine code involves following all reachable paths from the entry point [21, 8]. This method does not give complete coverage of the text space in the presence of indirect transfers of control such as indexed jumps and indirect calls. A common technique used to overcome this problem is the use of patterns. A pattern is generated for a particular compiler to cater for the way in which that compiler, or family of compilers, generates code for an indexed jump. This technique is extensively used
as most tools deal with a particular set of compilers; for example, TracePoint only processes Windows binaries generated by the Microsoft C++ compiler [23]. In the presence of optimized code, patterns do not tend to work very effectively, even when the code is generated by a compiler known to the pattern recognizer. In this paper we present a technique to recover the targets of indexed jumps on a variety of machines and languages. This technique is based on slicing of binary code [7] and expression substitution into a normal form. Section 3 presents several Pentium and SPARC examples of code that use indexed jumps, Section 4 explains our normalization technique, Section 5 provides results on the use of this technique on existing benchmark programs, and Section 6 reviews previous work in this area. The paper contains an appendix with extra interesting examples.

2. Compiler Code Generation for N-Conditional Branches

N-conditional branches were first suggested by Wirth and Hoare in 1966 [24, 25] as a useful extension to the Algol language. An n-conditional branch allows a programming language to select one of n branches in the code. This extension was implemented in Algol 68 in a form that allowed its use as a statement or an expression. In other words, the result of the case could be assigned to a variable. This high-level statement has evolved to the well-known switch statement in C or the case statement in Pascal, where labels are used for the different arms of the conditional branch, and a default arm is allowed, as per Figure 1. The C code shows the indexed variable num, which is tested against the values in the range 2 to 7 for individual actions, and if not successful, defaults to the last default action. Although not commonly documented in compiler textbooks, compiler writers generate different types of machine code for n-conditional branches.
These ways of generating n-conditional branches are determined by talking to compiler writers or by reverse engineering executable code. Several techniques for generating n-conditional branches from a compiler were documented in the 70s and 80s, when optimization for space and speed was an important issue. The most common techniques are described here based on [20].

    #include <stdio.h>
    int main() {
        int num;
        printf("input a number, please: ");
        scanf("%d", &num);
        switch(num) {
            case 2: printf("two!\n"); break;
            case 7: printf("seven!\n"); break;
            default: printf("other!\n"); break;
        }
        return 0;
    }

Figure 1. Sample switch program written in the C language.

The simplest way of generating code for an n-conditional branch is as a linear sequence of comparisons against each arm in the statement. This form is efficient for a small number of arms, typically 4 or less. A more sophisticated technique is the if-tree, where the selection is accomplished by a nested set of comparisons organized into a tree. The most common implementation is a jump table, which may hold labels or offsets from a particular label. This implementation requires a range test to determine the membership of values in the table. Although jump tables are the fastest method when there are many arms in the n-conditional branch, jump tables are space-wise inefficient if the case values are sparse. In such cases, a search tree is the most convenient implementation. When the arms of the n-conditional branch are sparse but can be clustered in ranges, a common technique is to combine search trees and jump tables to implement each cluster of values [16, 13]. This paper deals with the issue of recovering code from generated jump tables, in such a way that the target addresses of an indexed jump are determined. This paper does not attempt to recover high-level case statements, but rather the information necessary to translate an indirect branch indexing a jump table.
For an n-conditional branch implemented using a jump table, an indexed table is set up with addresses or offsets for each of the cases of the branch. The table itself is located in a read-only data section, or mixed in with the text section. In the interest of efficiency, range tests for such jump tables need to be concise. The most common way of doing both tests is as follows [5]:

    k <- case_selector - lower_bound
    compare k with (upper_bound - lower_bound)
    if unsigned_greater goto out_of_range
    assertion: lower_bound <= case_selector <= upper_bound

If the case selector value is within the upper and lower bounds, an offset into the jump table is calculated based on the size of each entry in the table; typically 4 bytes for a 32-bit machine. Based on the addressing modes available to a machine, either an indirect jump on the address of the table plus the offset, or an indexed jump on the same values is generated. The machine then continues execution at the target of the indirect/indexed jump. Retargetable compilers also use these techniques. A brief description of the code generation for an indirect jump through a branch table in a retargetable C compiler is given in [14] in the following specification:

    if t1 < v[l] goto lolab    ; l = lower bound
    if t1 > v[u] goto hilab    ; u = upper bound
    goto *table[t1 - v[l]]

Overall, compiler writers use a variety of heuristics to determine which code to generate for a given n-conditional branch, based on the addressing modes and instructions available on the target machine. It is also common for a compiler to have more than one way of emitting code for such a construct, based on the number of arms in the conditional branch and the sparseness of the values in such arms.

3. Examples of Existing Indexed Jumps in Binary Code

This section presents examples of Pentium and SPARC code that make use of indexed jump tables. The examples aim to familiarize the reader with a variety of ways of encoding an n-conditional branch in assembly code, as well as to show the degree of complexity of such code. The assembly code for the examples was generated by the Unix utility dis. This disassembler uses the convention of placing the destination operand on the right of the instruction.
The examples show annotated native assembly code, and where relevant, the address for the assembly instructions or the indexed table.

    movl -8(%ebp),%eax        ! Read index variable
    subl $0x2,%eax            ! Minus lower bound
    cmpl $0x5,%eax            ! Check upper bound
    ja 0xffffffd9 <80489dc>   ! Exit; out of range
    jmp *0x8048a0c(,%eax,4)   ! Indexed, scaled jump

    8048a0c:                  ! Table of addresses
    8048a10:                  ! to code handling
    8048a14: 8c               ! the various switch
    8048a18: a                ! cases

Figure 2. Pentium assembly code for simple switch program, produced by the cc compiler.

    ld [%fp - 20], %o0        ! Read index variable
    add %o0, -2, %o1          ! Minus lower bound
    cmp %o1, 5                ! Check upper bound
    bgu 0x10980               ! Exit if out of range
    sethi %hi(0x10800), %o0   ! Set table address
    or %o0, 0x108, %o0        ! (continued)
    sll %o1, 2, %o1           ! Multiply by 4
    ld [%o0 + %o1], %o0       ! Fetch from table
    jmp %o0                   ! Jump
    nop

    10908: 0x1091c            ! Table of pointers
    1090c: 0x
         : 0x
         : 0x10958

Figure 3. SPARC assembly code for simple switch program, produced by the cc compiler.

The first two examples in Figures 2 and 3 were generated by the cc compiler on a Solaris Pentium and SPARC machine respectively, from the sample program in Figure 1. In Figure 2, register eax is used as the index variable; its value is read from a local variable on the stack ([ebp-8]). The lower bound and the range of the table are checked (2 and 5 respectively); the code exits if the value of the index variable is out of bounds. If within bounds, an indexed scaled jump on (eax*4) is performed, offset from the start of the indexed table at 0x8048a0c. The entries of the table are addresses; each is displayed in little-endian format. Figure 3 performs the same logical steps as Figure 2 in SPARC assembly code, where indexed jumps do not exist but indirect jumps on registers are allowed. In the example, the indexed variable is initially in o0, which is set from a local variable on the stack ([fp-20]). The lower bound is subtracted and the normalized indexed variable is placed in o1. The range of the table is checked; if out of bounds, the code exits to address 0x10980. If within bounds, the address of the table is computed into o0 (by the sethi and or instructions), the indexed register is multiplied by 4 to get the right 4-byte offset into the indexed table, and the value of the table indexed at o1 is fetched into o0. A jump to o0 is then performed. Figure 4 presents a SPARC example that uses a hash function to determine how to index into the table. The code comes from the Solaris 2.5 vi program. The index variable is set in o0, and is normalized by subtracting its lower bound. The range of the table is checked; if the value is out of range, a jump to the end of the case statement is performed (0x18804). If within bounds, the table's address is set in register o2. The indexed register is hashed into o1 and multiplied by 8 (into o4) to get the right offset into the table (as the table contains two 4-byte entries per case). A word is loaded from the table into register o3 and its value is compared against the hash function key (the normalized index variable o0).
If the value matches, the code jumps to address 0x1885c, where a second word is read from the table into o0, and a register jump is performed to that address. In the case where the value fetched from the table does not match the key, an end-of-hashing comparison is performed against the value -1. If -1 is found, the code exits (0x18804); otherwise, the indexed register (o4) is set to point to the next value in the table (wrapping the offset into the table from the end of the table to the start) and the process is repeated at address 0x18554. Note that this table contains 2 entries per case; the first one is the normalized index value, and the second one is the target address for the code associated with that case entry.

    8057d90: movb 38(%eax),%al
    8057d93: testb $0x2,%al
    8057d95: setne %edx
    8057d98: andl $0xff,%edx
    8057d9e: testb $0x4,%al
    8057da0: setne %ecx
    8057da3: andl $0xff,%ecx
    8057da9: testb $0x8,%al
    8057dab: setne %eax
    8057dae: andl $0xff,%eax
    8057db3: shll $0x2,%edx
    8057db6: shll %ecx
    8057db8: orl %edx,%ecx
    8057dba: orl %eax,%ecx
    8057dbc: cmpl $0x7,%ecx
    8057dbf: jbe 0x280 < >
           : jmp * (,%ecx,4)

    805f5e8: f8 7d 05 08      ! Table of addresses
    805f5ec:                  ! of code to handle
                              ! switch cases

Figure 5. Pentium assembly code from the m88ksim program, produced by the Sun cc version 4.2 compiler.

Our last example, Figure 5, is from the m88ksim SPEC95 benchmark suite. This example shows 3 groups of tests on bits of a field within a structure, whose results are stored in registers edx, ecx and eax. The three partial results are then or'd together to get the resultant indexed variable in register ecx. The upper bound (7) is checked and, if within bounds, a branch is taken to the instruction that makes an indexed jump on the contents of register ecx, scaled by the entry size (4) and offset by the table address. Note that the branch (jbe) is the opposite of that normally found in switch statements (i.e. ja). This illustrates the danger of relying on instruction patterns to recover indexed branch targets.
The appendix illustrates two more examples.
    18524: ld [%fp - 20], %o0     ! Read indexed variable
    18528: sub %o0, 67, %o0       ! Subtract lower bound
    1852c: cmp %o0, 53            ! Compare with range
    18530: sethi %hi(0x18800), %o2 ! Set upper table addr
    18534: bgu 0x18804            ! Exit if out of range
    18538: nop
    1853c: or %o2, 108, %o2       ! Set lower table addr
    18540: srl %o0, 4, %o1        ! Hash
    18544: sll %o1, 1, %o1        !
    18548: add %o1, %o0, %o1      ! function
    1854c: and %o1, 15, %o1       ! modulo 16
    18550: sll %o1, 3, %o4        ! Mult by 8
    18554: ld [%o4 + %o2], %o3    ! First entry in table
    18558: cmp %o3, %o0           ! Compare keys
    1855c: be 0x1885c             ! Branch if matched
    18560: cmp %o3, -1            ! Unused entry?
    18564: be 0x18804             ! Yes, exit
    18568: nop                    ! (delay slot)
    1856c: add %o4, 8, %o4        ! No, linear probe
    18570: and %o4, 120, %o4      ! with wraparound
    18574: ba 0x18554             ! Continue lookup
    18578: nop

    1885c: add %o4, %o2, %o4      ! Point to first entry
    18860: ld [%o4 + 4], %o0      ! Load second entry
    18864: jmp %o0                ! Jump there
    18868: nop

    ! Each entry is a (key value, code address) pair
    1886c: 0x0
    18870: 0x187b4                ! Case 'C' + 0
    ! Unused entries have -1 (i.e. 0xffffffff) as the first entry
    18874: 0xffffffff             ! Rest of the table
    18878: 0x
         : 0x
         : 0x185b8                ! Case 'C' + 0x10 = 'S'
    18884: 0x2f
    18888: 0x18630                ! Case 'C' + 0x2f = 'r'

Figure 4. SPARC assembly code from the vi program, produced by the Sun cc compiler.

4. Our Technique

We have developed a technique to recover n-conditional branches from disassembled code. The technique is architecture, compiler and language independent, and has been tested on CISC and RISC machines with a variety of languages and compilers (or unknown compilers, when dealing with precompiled executables). Development of general techniques is an aim of our work, as analysis of executable code should not rely on particular compiler knowledge; such knowledge prevents the techniques from working with code generated by other compilers, and in most cases, for other machines. There are 3 steps to our technique:

1. Slice the code at the indexed/indirect register jump,
2. Perform expression substitution to recover pseudo high-level statements, and
3. Check against indexed branch normal forms.
4.1. Slicing of Binary Code

Our executable code analysis framework allows for the disassembly of the code into an intermediate representation composed of register transfer lists (RTL) [9] and control flow graphs for each decoded procedure in the program. The RTL describes the effects of machine instructions in terms of register transfers, and is general enough to support RISC and CISC machine descriptions. When an indexed or indirect jump is decoded, we create an intraprocedural backward slice of the disassembled binary code [7]. Slicing occurs by following the transitive closure of registers and condition codes that are used in a given expression. The stop criterion for a given register along a path is when that register is loaded from memory (i.e. from a local variable, a procedure argument, or a global variable), is returned by another function, or reaches the start of the procedure without being defined (and hence is a register set by the caller). For the purposes of determining indexed jump tables, we have an extra stop criterion: if the lower bound of the indexed jump is found, and other relevant information has been found, no more slicing is performed. Of course, this condition is not always satisfied, as indexed tables starting at 0 do not need to check for the lower bound. In such cases, the slice finishes by means of the other stop conditions. In the case of slices across calls, we stop if the register is returned by the call (i.e. eax on Pentium or o0 on SPARC); in other cases we assume registers are preserved across calls and continue slicing. This is a heuristic that works well in practice and is needed in only a few cases. The heuristic works when the machine code conforms to the operating system's application binary interface [1]. For example, for Figure 2, the following slice is created using RTL notation:

    (1) eax = m[ebp-8]
    (2) eax = eax - 2
    (3) ZF = (eax - 5) = 0? 1 : 0
    (4) CF = (~eax@31 & 5@31) | ((eax-5)@31 & (~eax@31 | 5@31))
    (5) PC = (~CF & ~ZF) = 1? 0x80489dc : PC
    (6) PC = m[0x8048a0c + eax * 4]

Register ebp points to the stack, hence indexed variable eax gets a value from the local memory for that procedure. The indexed register is normalized by subtracting the lower bound (2) and its range is checked against 5 (the difference between the upper and lower bounds). If within bounds, an indexed jump is performed.

4.2. Expression Substitution

Once a slice has been computed, we perform expression substitution by means of forward substitution. This is a common technique used in reverse engineering when recovering higher-level statements from more elementary ones, such as assembly code [6, 10] and COBOL code [17]. As per [10], a definition of a register r at instruction i, in terms of a set of registers R, can be forward substituted at the use of that register in another instruction j, if the definition at i is the unique definition of r that reaches j along all paths in the program, and no register in R has been redefined along that path. The use of r in the instruction at j is then replaced by the expression that defines r at i, and the need for the instruction at i disappears. The previous relationship is partly captured by the definition-use (du) and use-definition (ud) chains of an instruction: a use of a register is uniquely defined if it is only reached by one definition, that is, its ud chain set has only one element. The path requirement is known as the r-clear(i, j) relationship for register r. More formally, the definition of r at i can be substituted into j iff ud(r, j) = {i} and the path from i to j is r'-clear for every register r' in R. Note that this definition does not place a restriction on the number of uses of the definition of r at i. Hence, if the number of elements in du(r, i) is n, instruction i can potentially be substituted into n different instructions j1 ... jn, provided they satisfy the r-clear(i, jk) property.
In our example of Figure 2, the application of forward substitution to the slice found in Section 4.1 gives the following pseudo high-level statements:

    (3) jcond ([ebp-8] > 7) 0x80489dc
    (4) jmp [0x8048a0c + ([ebp-8] - 2) * 4]
4.3. Normal Form Comparison

Our previous example can be rewritten in the following way:

    jcond (var > num_u) X
    jmp [T + (var - num_l) * w]

where var is a local variable, for example [ebp-8]; num_u is the upper bound for the n-conditional branch, for example 7; num_l is the lower bound of the n-conditional branch, for example 2; T is the indexed table's address (and is of type address), for example 0x8048a0c; and w is a constant equivalent to the size of the word of the machine, 4 in this example. Based on this information, we can infer that the number of elements in the indexed table is num_u - num_l + 1, for a total of 6 in this case. In this example, the elements of the indexed table are labels (i.e. addresses), as the jump is to the target address loaded from [0x8048a0c + ([ebp-8] - 2) * 4]. The previous example shows only one of several normal forms that are used to encode n-conditional branches using an indexed jump table; we call this normal form type A. Figure 6 shows the 3 different normal forms that we have identified in executable code that runs on SPARC and Pentium. Normal form A (address) is for indexed tables that contain labels as their values. Normal form O (offset) is for indexed tables that contain offsets from the start of the table to the code of each case; form O can also be found in a position-independent version. Normal form H (hashing) contains pairs (key value, address) at each entry in the indexed table. In our 4 examples of Figures 2 to 5, we find the following normal forms, respectively:

    jcond (r[24] > 5) 0x80489dc
    jmp [0x8048a0c + (r[24] * 4)]                      normal form A

    jcond (r[9] > 5) 0x10980
    jmp [0x + (r[9] * 4)]                              normal form A

    jcond ((r[8] - 67) > 53) 0x18804
    jmp [0x1886c + ((((((r[8] - 67) >> 4) << 1) + (r[8] - 67)) & 15) << 3)]
                                                       normal form H

    jcond (((((al & 2)? 1:0) & 0xff) << 2 | (((al & 4)? 1:0) & 0xff) << 1 | (((al & 8)? 1:0) & 0xff)) > 7) 0x8057dc5
    jmp [0x805f5eb + ((((al & 2)? 1:0) & 0xff) * 4 + (((al & 4)? 1:0) & 0xff) * 2 + (((al & 8)? 1:0) & 0xff)) * 4]
                                                       normal form A

Examples of form O are given in the appendix.

5. Experimental Results

We tested the technique for recovery of indexed branch code on Pentium and SPARC in a Solaris environment. The following SPEC95 benchmark programs were used for testing:

- go: artificial intelligence; plays the game of Go
- m88ksim: Motorola 88K chip simulator; runs test program
- gcc: GNU C compiler; builds SPARC code
- compress: compresses and decompresses a file in memory
- li: LISP interpreter
- ijpeg: graphic compression and decompression
- perl: manipulates strings (anagrams) and prime numbers in Perl
- vortex: a database program

All benchmark programs were compiled with the Sun cc compiler version 4.2 using standard SPEC optimizations (i.e. -O4 on SPARC and -O on Pentium). We also include results for the awk script interpreter utility, and the vi text editor (on both Solaris 2.5 and 2.6). These programs are part of the Unix OS. Figures 7 and 8 show the number of indexed jumps found in each benchmark program, the classification of such indexed jumps into the 3 normal forms (A, O and H), and any unknown types. In the case of SPARC code, most indexed jump tables are of form O, which means that the indexed table stores offsets from the start of the table to the destination target address. In the case of Pentium code, almost all indexed jump tables are of form A, meaning that the table contains the
target addresses for each of the entries in the case statement.

    Type A:
        jcond (r[v] > num_u) X
        jmp [T + expr * w]
        expr: r[v]
              r[v] - num_l
              ((r[v] - num_l) << 24) >> 24

    Type O:
        jcond (r[v] > num_u) X
        jmp [T + expr * w] + T
        or:  jmp PC + [PC + (expr * w + k)]    (position independent)
        expr: r[v]
              r[v] - num_l
              ((r[v] - num_l) << 24) >> 24

    Type H:
        jcond (r[v] > num_u) X
        jmp [T + ((expr & mask) * 2*w) + w]
        expr: ((r[v] - num_l) >> s) + (r[v] - num_l)
              ((r[v] - num_l) >> 2*w) + ((r[v] - num_l) >> 2) + (r[v] - num_l)

Figure 6. Normal forms for n-conditional code after analysis.

Figure 7. Number of indexed jumps for SPARC benchmark programs.

Figure 8. Number of indexed jumps for Pentium benchmark programs.

The primary motivation for this work was to increase our coverage of decoded code in an executable program. We measured the coverage obtained from our technique using the size in bytes of the text segment(s) of the program, compared to the number of bytes decoded and the number of bytes in indexed jump tables. The figures do not necessarily add up to 100% due to unreachable code during the decoding phase. Also, in the case of SPARC, we duplicate some instructions in order to remove delayed branch instructions; this duplication is counted twice in our model, leading to slightly over 100% coverage in rare cases. Figures 9 and 10 show the results of our coverage analysis. The results show that when indexed tables are present in the program, up to 90% more of the code can be reached by decoding such tables correctly. The li and ijpeg programs show a small coverage of their code sections. This is due to indirect calls on registers and the fact that decoding is performed by following known paths. In the case of ijpeg, a large percentage of the procedures are reached only via indirect calls, hence they are never decoded.
We have not yet extended our analysis techniques to deal with indirect calls. In the context of our binary translation framework, we rely on an interpreter to process such code at runtime.

6. Previous Work

Not much work has been published in the literature on the recovery of indexed jump targets. The published techniques tend to be ad hoc and tailored to a specific platform or compiler, and tend to rely on pattern matching. The qpt binary profiler is a tool to profile and trace code on MIPS and SPARC platforms. Profiling and
tracing is done by instrumenting the executable code. Indexed jump tables are detected by relying on the way in which the compiler generated code for the jump, mainly by expecting the table to be in the data segment in the case of MIPS, or in the code segment, immediately after the indirect jump, on the SPARC. The end of the table is found by examining the instructions prior to the indirect jump and determining the table's size; alternatively, the text space is scanned until an invalid address is met [18]. The dcc decompiler is an experimental tool for decompiling DOS executables into C code. The method used in this tool was that of pattern matching against known patterns generated by several compilers on a DOS machine [8]. EEL is an executable editing library for RISC machines. Slicing is used to determine the instructions that affect the computation of the indirect jump and determine the indexed jump table. No precise method is given. Measurements on the success of this technique on SPARC using the SPEC92 benchmarks reveal that 100% recovery of indexed jumps is achieved for code compiled by the gcc and Sun Fortran compilers, and 89% for the SunPro compilers. The recovery ratio was measured by counting the number of indirect jumps expected and recovered [19].

    Program     w/o analysis   with analysis   difference
    awk (2.6)   23%            65%             42%
    vi (2.5)    24%            97%             73%
    vi (2.6)    29%            96%             67%
    go          92%            101%            9%
    m88ksim     38%            71%             33%
    gcc         60%            91%             31%
    compress    91%            91%             0%
    li          34%            37%             3%
    ijpeg       20%            22%             2%
    perl        10%            100%            90%
    vortex      70%            79%             9%

Figure 9. Coverage of code for SPARC benchmarks.

    Program     w/o analysis   with analysis   difference
    awk (2.6)   22%            64%             42%
    vi (2.6)    59%            87%             59%
    go          89%            99%             10%
    m88ksim     36%            73%             37%
    gcc         51%            86%             35%
    compress    82%            82%             0%
    li          24%            26%             2%
    ijpeg       18%            20%             2%
    perl        9%             96%             87%
    vortex      69%            75%             6%

Figure 10. Coverage of code for Pentium benchmarks.
IDA Pro, a Pentium disassembler, makes use of compiler patterns to determine which compiler was used to compile the original source program [2]. IDA Pro's recovery of indexed jump tables is good, but the technique has not been documented in the literature. Our techniques compare favourably with those of other tools. They have been tested extensively with code generated by different compilers on both CISC and RISC machines, indicating the generality and machine independence of the technique.

7. Conclusions

We have presented a technique based on slicing and expression substitution to understand and recover the code of n-conditional branches implemented by indexed jump tables. The technique is suitable for recovery of code in machine-code manipulation tools such as binary translators, code instrumentors and decompilers. Our technique has been tested on Pentium and SPARC code in a Solaris environment against the SPEC95 integer benchmark programs. Over 500 n-conditional branches were correctly detected in these programs, making this technique suitable for path coverage during the decoding of machine instructions. For the perl benchmark, over 90% of extra code was decoded due to this recovery technique. This work is part of the retargetable binary translation project at The University of Queensland. For more information refer to:
A. Appendix

Figures 11 and 12 illustrate two examples of form O from SPARC code. The former contains an indexed table of offsets from the table to the code that handles each individual switch case. The latter also contains an indexed table of offsets from the table to the code; however, the address of the table is computed in a position-independent way (via the call to .+8, which has the side effect of setting the %o7 register to the current program counter).

    10a58:  0x0009c                 ! Indexed table
    10a5c:  0x000dc                 !   of offsets
    10a60:  0x000fc
    10a64:  0x0011c

    sethi   %hi(0x10800), %l1       ! Set table address
    add     %l1, 0x258, %l1         !   into %l1
    ld      [%fp - 4], %l0          ! Read idx variable
    sub     %l0, 2, %o0             ! Subtract min val
    cmp     %o0, 5                  ! Cmp with range-1
    bgu     0x10b14                 ! Exit if out of range
    sll     %o0, 2, %o0             ! Multiply by 4
    ld      [%o0 + %l1], %o0        ! Fetch from table
    jmp     %o0 + %l1               ! Jump to table+offset
    nop                             ! Delay slot instr

Figure 11. Form O example for SPARC assembly code.

    ldsb    [%l6], %o0              ! Get switch var
    clr     %i3                     ! (Not relevant)
    sub     %o0, 2, %o0             ! Subtract min value
    cmp     %o0, 54                 ! Cmp with range-1
    bgu     0x44acc                 ! Exit if out of range
    sll     %o0, 2, %o0             ! Multiply by 4
    call    .+8                     ! Set %o7 = pc
    sethi   %hi(0x0), %g1           ! Set %g1 =
    or      %g1, 0x1c, %g1          !   0x0001c
    add     %o0, %g1, %o0           ! %o0 = 0x43eb8 + 0x1c
                                    !     = 0x43ed4
    ld      [%o7 + %o0], %o0        ! Fetch from table
    jmp     %o7 + %o0               ! Jump to table+offset
    nop                             ! Delay slot instr

    43ed4:  0x0021c                 ! Table of offsets from
    43ed8:  0x00af4                 !   table to case code
    43edc:  0x000f8                 ! e.g. 0x43ed4 + 0x00f8
                                    !    = 0x43fcc
    43ee0:  0x008d0

Figure 12. Form O example for SPARC assembly code using position independent code.

References

[1] Unix System V: Application Binary Interface. Prentice Hall, Englewood Cliffs, New Jersey.
[2] IDA Pro disassembler, Data Rescue.
[3] D. Asheim. The wine project.
[4] T. Ball and J. Larus. Optimally profiling and tracing programs. ACM Transactions on Programming Languages and Systems, 16(4), July.
[5] R. Bernstein. Producing good code for the case statement. Software Practice and Experience, 15(10).
[6] C. Cifuentes. Interprocedural dataflow decompilation. Journal of Programming Languages, 4(2):77-99, June.
[7] C. Cifuentes and A. Fraboulet. Intraprocedural slicing of binary executables. In M. Harrold and G. Visaggio, editors, Proceedings of the International Conference on Software Maintenance, Bari, Italy, 1-3 October. IEEE-CS Press.
[8] C. Cifuentes and K. Gough. Decompilation of binary programs. Software Practice and Experience, 25(7), July.
[9] C. Cifuentes and S. Sendall. Specifying the semantics of machine instructions. In Proceedings of the International Workshop on Program Comprehension, Ischia, Italy, June. IEEE CS Press.
[10] C. Cifuentes, D. Simon, and A. Fraboulet. Assembly to high-level language translation. In Proceedings of the International Conference on Software Maintenance, Washington DC, USA, November. IEEE CS Press.
[11] B. Cmelik and D. Keppel. Shade: A fast instruction-set simulator for execution profiling. In Proceedings of the ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems.
[12] V Communications. Sourcer - Commenting Disassembler, 8088 to Instruction Set Support. V Communications, Inc., 4320 Stevens Creek Blvd., Suite 275, San Jose, CA 95129.
[13] C. Fraser and D. Hanson. A retargetable compiler for ANSI C. SIGPLAN Notices, 26(10):29-43, Oct.
[14] C. Fraser and D. Hanson. A Retargetable C Compiler: Design and Implementation. The Benjamin/Cummings Publishing Company, Inc., Redwood City, CA.
[15] L. Freeman. Don't let missing source code stall your Year 2000 project. Year 2000 Survival Guide.
[16] J. Hennessy and N. Mendelsohn. Compilation of the Pascal case statement. Software Practice and Experience, 12.
[17] H. Huang, W. Tsai, S. Bhattacharya, X. Chen, and Y. Wang. Business rule extraction techniques for COBOL programs. Journal of Software Maintenance: Research and Practice, 10(1):3-35, Jan.
[18] J. Larus and T. Ball. Rewriting executable files to measure program behavior. Software Practice and Experience, 24(2), Feb.
[19] J. Larus and E. Schnarr. EEL: Machine-independent executable editing. In SIGPLAN Conference on Programming Language Design and Implementation, June.
[20] A. Sale. The implementation of case statements in Pascal. Software Practice and Experience, 11.
[21] R. Sites, A. Chernoff, M. Kirk, M. Marks, and S. Robinson. Binary translation. Communications of the ACM, 36(2):69-81, Feb.
[22] T. Thompson. An Alpha in PC clothing. Byte, Feb.
[23] TracePoint. HiProf hierarchical profiler. hiprof/products/hiprof/.
[24] N. Wirth and C. Hoare. A contribution to the development of ALGOL. Communications of the ACM, 9(6).
[25] C. Wrandle. Notes on the case statement. Software Practice and Experience, 4.