SOURCE CODE OBFUSCATION BY MEAN OF EVOLUTIONARY ALGORITHMS


Sébastien Martinez, 2011
Tutor: Sébastien Varrette
Advisor: Benoît Bertholon
University of Luxembourg, Faculty of Sciences, Technologies and Communications
Master Informatique, Spécialité Recherche en Informatique, TELECOM Bretagne

1 Introduction

Usually, when talking about security, the matter is about protecting a computer from intrusions or malicious software. Here, the matter is how to protect software from piracy; more precisely, how to let a program run without letting the user learn how it is built. Distributing binaries instead of source code is not enough to achieve this goal, since debuggers and decompilers can help recover the secret algorithm or data structure one wants to hide from the user. The ideal solution would be code that is impossible for the user to understand. Since this goal cannot be reached, the code has to be complicated enough that users seeking the secret algorithms either give up or obtain them only once they are obsolete (e.g. when a new, better version is available). The techniques used for this purpose are called obfuscation techniques.

There are several reasons why someone would want to obfuscate his or her code. The most common one is to hide an algorithm from eavesdroppers while executing the code on an unsecured platform, e.g. a public cloud. Code obfuscation is not only used to keep pieces of code secret: it can also introduce a fingerprint into the software for each user, allowing the user of a specific version of the code to be identified. For example, one could make a special copy of the software for each person it is distributed to. If illegal copies of the software are then found, it is easy to trace the person who distributed them.

Since there is no easy way to find the best transformations for a given program, we will study the possibility of using evolutionary algorithms to find the best obfuscated program according to the criteria given in this article.
As a first step, a short overview of evolutionary algorithms and of the source to source compiler PIPS will be given, before listing several software complexity metrics that can be used to measure the efficiency of obfuscation transformations. Then, obfuscation techniques will be classified based on the parts of the program they affect. Before summarizing the obfuscation metrics, the matter of deobfuscators will be tackled. Finally, we will detail how evolutionary algorithms are planned to be used for finding the best obfuscated programs.

2 Related Work

2.1 Evolutionary Algorithm

Evolutionary Algorithms (EAs) are a class of solving techniques based on the Darwinian theory of evolution [8], which search through a population $X_t$ of solutions. Members of the population are feasible solutions and are called individuals. Each iteration of an EA involves a competitive selection that weeds out poor solutions through the evaluation of a fitness value, which indicates the quality of the individual as a solution to the problem. At each generation, the evolutionary process applies a set of stochastic operators to the individuals, typically recombination (or cross-over) and mutation. Many models of EAs exist; the pseudo-code of a general execution scheme is given in Algorithm 1.

Algorithm 1: General scheme of an EA in pseudocode.
    t := 0;
    Generation(X_t);                       // generate the initial population
    Evaluation(X_t);                       // evaluate population
    while stopping criteria not satisfied do
        X^_t := ParentsSelection(X_t);     // select parents
        X'_t := Modification(X^_t);        // cross-over + mutation
        Evaluation(X'_t);                  // evaluate offspring
        X_{t+1} := Selection(X_t, X'_t);   // select survivors for the next generation
        t := t + 1;
    end

Executing even a simple EA requires high computational resources for non-trivial problems. This happens when dealing with large individuals (e.g. long sequences of genes in a Genetic Algorithm (GA), or large parse trees in Genetic Programming (GP)) and/or large populations. Both increase the time required to evaluate the population, which is usually the costliest operation in EAs. In such cases, the time-to-solution on a single computer is prohibitively long for practitioners (especially with GP). An example of such an expensive EA for a computer vision problem is described in [13], where more than 24h are required to execute the algorithm. An instance with even bigger requirements was reported by Melab et al. in [12], where a predictive mathematical model for the concentration of sugar in beets was constructed using a parallel GA; the cumulative CPU time exceeded 27 days.

2.2 PIPS - source code compiler

PIPS (Parallélisation Interprocédurale de Programmes Scientifiques) [9], [1] is an interprocedural source to source compiler that analyzes C and Fortran programs and transforms them to optimize their parallel execution. PIPS can apply transformations that can be used to obfuscate the input code, like loop unrolling or variable renaming. Moreover, PIPS can use SIMD instructions to accelerate a program by means of vector instructions. SIMD introduces calls to intrinsics that can be hard to understand for programmers who do not know them. Using PIPS to apply transformations to programs, we gain a powerful tool for code obfuscation. The Python frontend pyps allows the user to specify the transformations to be applied or specific data structures to be used. Several of the transformations mentioned in this article are already available in PIPS.

3 Taxonomy of Obfuscation Methods

Obfuscation methods can take many forms and can affect many parts of a program: the data structures used, the functions called or even the textual representation of the source code (e.g. suppressing any indentation). Collberg and Nagra proposed to use this fact to watermark or birthmark programs [7], the mark being inserted by means of transformations applied to the program. When obfuscating a program, some dead code is often inserted. If this code is instead written to make the program behave in a way the user does not want, the program becomes tamperproof: any reverse engineer executing these pieces of code would face their side effects. In this article, we focus on the use of transformations to obfuscate the program, although the transformations used for watermarking or tamperproofing are similar to the ones listed in this section.
As a first step, the definitions of an obfuscating transformation and of the quality of a transformation will be given. Then some software complexity metrics will be enumerated, before giving a non-exhaustive list of obfuscation transformations based on the work of Collberg et al. [7], [5]. Then, before summing up the transformation qualities, the subject of deobfuscators will be tackled.

3.1 Preliminary definitions

In order to classify and evaluate obfuscation transformations, we need to define several notions.

Definition 1 (Obfuscating Transformation). Let τ be a transformation of a source program P into a target program P'. τ is an obfuscating transformation if P and P' have the same observable behavior. More precisely, the following conditions must hold:
- If P fails to terminate or terminates with an error condition, then P' may or may not terminate.
- Otherwise, P' must terminate and produce the same output as P.

Observable behavior can be defined as the behavior experienced by the user, that is, everything the user can notice at first sight. Hence, if P' has side effects (newly created files, network communications) that are not noticed by the user, it can still have the same observable behavior as P, provided its user-experienced effects are the same.

In order to evaluate the quality of obfuscation transformations, we need to define several transformation properties and metrics. The three main properties are Potency, Resilience and Cost.
- Potency measures how useful a transformation is at hiding the intent of the program's author. It can be seen as the efficiency of an obfuscation transformation against human readers.
- Resilience measures the efficiency of an obfuscation transformation against automatic deobfuscators (as opposed to potency).
- Cost measures the penalty introduced by the transformation: a transformation can make the program use more memory or more time.
These three measures compose the quality of a transformation.

Definition 2 (Transformation Potency). Let τ be a behavior-conserving transformation that transforms a source program P into a target program P'. Let E(P) be the complexity of P. $\tau_{pot}(P)$, the potency of τ with respect to a program P, is a measure of the extent to which τ changes the complexity of P. It is defined as

$\tau_{pot}(P) = E(P')/E(P) - 1$

We say τ is a potent obfuscating transformation if $\tau_{pot}(P) > 0$.

In this definition, E is a measure of complexity. Since there are many software complexity measures, one has to be chosen. Several metrics will be listed in the next subsection. Software complexity metrics are often subjective: some transformations increase the program complexity according to the metric in use, while undoing them is simple for a machine although hard for a human reader, as we will see further. Hence, potency can be pictured as a measure of the usefulness of a transformation against human readers. To measure the usefulness of a transformation against automatic deobfuscators, resilience has to be introduced. Resilience takes two parameters into consideration: Programmer Effort (the amount of time taken to build an automatic deobfuscator that efficiently reduces the potency of τ) and Deobfuscator Effort (the execution time and memory space required by the deobfuscator to efficiently reduce the potency of τ).

Definition 3 (Transformation Resilience). Let τ be a behavior-conserving transformation that transforms a source program P into a target program P'. $\tau_{res}(P)$ is the resilience of τ with respect to a program P. $\tau_{res}(P)$ = one-way if information is removed from P such that P cannot be reconstructed from P'. Otherwise, $\tau_{res} = Res(\tau_{Deobfuscator\ effort}, \tau_{Programmer\ effort})$, where Res is the function defined by the resilience matrix in Figure 1.

Transformations often introduce some loss in the program: after a transformation is applied, the program may need more memory space or more time to terminate. Transformation cost captures this notion.

Definition 4 (Transformation Cost). Let τ be a behavior-conserving transformation that transforms a source program P into a target program P'. $\tau_{cost}(P)$ is the extra execution time/space of P' compared to P.
$\tau_{cost}(P)$ is:
- dear if executing P' requires exponentially more resources than P,
- costly if executing P' requires $O(n^p)$, p > 1, more resources than P,
- cheap if executing P' requires O(n) more resources than P,
- free if executing P' requires O(1) more resources than P.

Potency, resilience and cost compose the quality metric of obfuscating transformations.

Definition 5 (Transformation Quality). $\tau_{qual}(P)$, the quality of a transformation τ, is defined as the combination of the potency, resilience and cost of τ:

$\tau_{qual}(P) = (\tau_{pot}(P), \tau_{res}(P), \tau_{cost}(P))$

Now that obfuscating transformations and transformation quality have been defined, several software complexity metrics will be presented. Combining these notions will enable the evaluation of the obfuscation transformations listed further in this article.

3.2 Metrics

There is no single metric for software complexity, because the complexity of a program has many aspects, many of them subjective. Moreover, complexity metrics depend on the language used, and more precisely on its paradigm. Hence, we have to choose the metric best suited to our context. Since we want to classify and compare obfuscation transformations, we have to consider several metrics, depending on the transformation and the elements it affects.

McCabe proposes a graph theory oriented metric [11] in which the control flow of a program is seen as a graph. The complexity of a program is measured by the number of linearly independent paths, which is equal to e - n + p in strongly connected graphs (e being the number of edges, n the number of vertices and p the number of connected components of the graph). Control flows of programs being assumed to have a strongly connected structure, we can see how adding more independent paths to a program increases its complexity.

Chidamber and Kemerer listed several metrics for object oriented programs [4], like giving weights to classes, measuring the coupling between classes (i.e. evaluating the interactions between classes) or the lack of cohesion in methods (i.e. measuring the similarity between two methods by counting the instance variables they use in common). For programs that are not object oriented, some parallels can be drawn with data structures (e.g. measuring the global variables or data structures used by several functions, evaluating the interactions between variables).

Figure 1: Resilience of obfuscating transformations: scale of values and resilience matrix.

Scale of values, from low to high resilience: trivial, weak, strong, full, one-way.

Resilience matrix:
    Programmer effort   | Deobfuscator effort: Poly time | Deobfuscator effort: Exp time
    Inter process       | full                           | full
    Inter procedural    | strong                         | full
    Global              | weak                           | strong
    Local               | trivial                        | weak

Collberg et al. listed the most popular software complexity metrics [5]. Each of them is written µn in the following and denotes a specific metric applying to functions, data or the whole program.
- µ1 Program Length: the more operators and operands P has, the more complex it is.
- µ2 Cyclomatic Complexity: the complexity of a function is measured by the number of predicates it contains.
- µ3 Nesting Complexity: the more deeply the conditionals of a function are nested, the more complex that function is.
- µ4 Data Flow Complexity: the complexity of a function increases with the number of inter-basic-block variable references.
- µ5 Fan-in/out Complexity: a function is more complex if it has more formal parameters; its complexity also increases with the number of global data structures it reads or writes.
- µ6 Data Structure Complexity: the complexity of a program increases with the complexity of the static data structures it uses. Scalar variables have a constant complexity. The complexity of an array increases with its number of dimensions and the complexity of its element type.

3.3 Obfuscation Techniques

Based on the metrics previously enumerated, a first way to obfuscate a program would be to increase the complexity of its data structures and of its functions.
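To make the link between Definition 2 and a metric such as µ2 concrete, the following minimal sketch computes a potency value for a toy transformation. It is only an illustration under stated assumptions: the programs are written in Python rather than C, the complexity measure is a rough predicate count obtained with the standard `ast` module, and the names `cyclomatic_complexity`, `potency`, `ORIGINAL` and `OBFUSCATED` are hypothetical, not part of any tool mentioned in this article.

```python
import ast

# Node types counted as predicates (a simplifying assumption of this sketch).
_PREDICATE_NODES = (ast.If, ast.While, ast.For, ast.IfExp)


def cyclomatic_complexity(source: str) -> int:
    """Rough stand-in for mu_2: one plus the number of predicates in the source."""
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, _PREDICATE_NODES) for node in ast.walk(tree))


def potency(original: str, obfuscated: str, metric=cyclomatic_complexity) -> float:
    """tau_pot(P) = E(P') / E(P) - 1, with the complexity E given by `metric`."""
    return metric(obfuscated) / metric(original) - 1


ORIGINAL = """
def total(values):
    s = 0
    for v in values:
        s += v
    return s
"""

# Same behaviour for integer inputs: v * (v + 1) is always even,
# so the else branch is never taken.
OBFUSCATED = """
def total(values):
    s = 0
    for v in values:
        if (v * (v + 1)) % 2 == 0:
            s += v
        else:
            s -= v
    return s
"""

print(potency(ORIGINAL, OBFUSCATED))
```

With this particular metric the transformation is potent in the sense of Definition 2, since the added test raises the predicate count. A different choice of E could rank the same transformation differently, which is why the metric has to be fixed before transformations are compared.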
But in this section we will see that the best results are obtained by combining these types of transformations and by mixing the usage of variables and functions in order to make the control flow more complex. The notion of opaque construct will also be introduced. Obfuscation techniques can be classified in three categories based on the parts of the program they affect: layout obfuscation, data obfuscation and control obfuscation.

Data obfuscation

Data obfuscation gathers all the transformations that obscure the data structures used in a program. For example, splitting a vector into two vectors is a data obfuscation technique. We can distinguish several classes of transformations: those affecting the storage, the encoding, the ordering or the aggregation of the data.

When choosing data structures, the most suitable way of storing or encoding the data is usually chosen. For example, a 16-bit int conventionally represents the value 6 by the bit pattern 0000000000000110. We could decide not to respect this convention and let the same bit pattern encode the value 4.

Changing encoding
A typical example of an encoding transformation is to use more than one variable to encode one value. For instance, if we want to transform the variable k in P, we can choose constants c1 and c2 and use c1*k + c2 instead of k in P'. There is a trade-off between resilience and potency, and between resilience and cost. The example below has little impact on the execution time of P', but common compiler analyses can undo such a transformation.

    (c1 = 5, c2 = 2)

    Original:
        int k;
        for (k=1; k<100; k++) {
            ... vect[k] ...
        }

    Obfuscated:
        int k;
        for (k=7; k<502; k++) {
            ... vect[(k-2)/5] ...
            k+=4;
        }

Promoting variables
Promoting a variable means replacing a specialized storage structure by a more general one. For example, in a language such as Java, an integer typed variable can be replaced by an instance of the Integer class. Such a transformation usually has low resilience and potency but can be more effective when used in conjunction with other transformations. Promotion can also mean extending the lifetime of a variable, like turning a local variable into a global one. Such a transformation increases the number of global variables used by the functions of the program.

    Original:
        void foo() { int i; ... i ... }
        void bar() { int k; ... k ... }

    Obfuscated:
        int c;
        void foo() { ... c ... }
        void bar() { ... c ... }

Splitting variables
Splitting a variable i means replacing it by a set of variables (i1, ..., ik). Three pieces of information have to be given: a function f(i1, ..., ik) that maps i1, ..., ik to i, a function g(i) that maps i to the corresponding i1, ..., ik, and operations on i1, ..., ik corresponding to the operations available on i.
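The three ingredients of variable splitting can be sketched for k = 2 as follows. This is a minimal illustration, assuming one particular encoding (i = 2*i1 + i2 with i2 in {0, 1}) chosen for this example; the function names `f`, `g` and `split_add` mirror the notation above but are otherwise hypothetical.

```python
from typing import Tuple

Split = Tuple[int, int]


def g(i: int) -> Split:
    """Map i to its split representation (i1, i2) such that i == 2*i1 + i2."""
    return divmod(i, 2)


def f(i1: int, i2: int) -> int:
    """Map a split representation back to the value it encodes."""
    return 2 * i1 + i2


def split_add(x: Split, y: Split) -> Split:
    """Addition carried out directly on split values.

    The low parts are added first; their carry is propagated to the
    high parts, so neither operand is ever rebuilt with f.
    """
    carry, low = divmod(x[1] + y[1], 2)
    return (x[0] + y[0] + carry, low)


a, b = g(13), g(29)
print(a, b, split_add(a, b), f(*split_add(a, b)))
```

Every operation used on i in the original program needs such a counterpart on the split representation, which is one way to see why the cost grows with k.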
The potency and cost of such transformations increase with k, hence this transformation is usually applied with k = 2 or 3.

Converting static data to procedural data
These transformations replace static data by a function that returns this data. Many pieces of data can be replaced by a function taking one parameter and returning one of these pieces of data depending on the given parameter. Since storing all the static data in one function is not desirable, this function can be split into many functions scattered throughout the control flow of the program.

    Original:
        ... "abc" ... "dfe" ...

    Obfuscated:
        string foo(int a) {
            if (a == 0) return "abc";
            else if (a == 1) return "dfe";
        }
        ... foo(0) ... foo(1) ...

Aggregation transformations
In the same way that splitting data obfuscates the code, aggregating data adds obscurity to it. One could merge several variables into one. For example, a 64-bit integer could store two 32-bit integers, or an array of size k could store k variables sharing the same type. These transformations have low resilience, since a deobfuscator only needs to study the operations on the aggregated data. Still, we can insert fake operations in blocks of dead code. One could also restructure arrays: merging several arrays into one, splitting an array into several arrays, folding an array (increasing its dimension) or flattening an array (decreasing its dimension). These transformations often have low potency because complexity metrics cannot measure the fact that some of them introduce new structures. For example, a programmer manipulating an image would declare a two-dimensional array. Manipulating a one-dimensional array, or an array of three or more dimensions, instead would significantly increase the obscurity of the program.

Ordering transformations
Randomizing the order of declarations is generally a good idea, whether it is the ordering of data in arrays or the order of function definitions. In the example below, the data in A is reordered using a function f.

    Original:
        A[100];
        for (i=1; i<100; i++) {
            ... A[i] ...
        }

    Obfuscated:
        A[100];
        for (i=1; i<100; i++) {
            ... A[f(i)] ...
        }

Layout obfuscation

Layout obfuscation gathers all the transformations that change the information carried by the formatting of the code. For example, scrambling identifier names or the code indentation are layout obfuscation techniques. Layout transformations are often one-way and free, while their potency varies depending on the transformation.

Control obfuscation

Control obfuscation obscures the control flow of the program. Control transformations may affect the aggregation, the ordering or the computations of the control flow. Control aggregation transformations scatter computations that belong together and group computations that have nothing in common. Control ordering transformations randomize the order of instructions, and computation transformations insert new code or change the algorithms employed in the program. Applying control obfuscation techniques often slows down the program. The programmer has to choose between the highly efficient program he intends to distribute and its highly obfuscated but slower alternative.

Opaque predicates
Opaque constructs are predicates or variables with properties that are known a priori by the obfuscator but are hard for the deobfuscator to guess. The resilience and cost of an opaque construct are linked to the resilience and cost of the transformation that uses it.
The resilience and cost of an opaque construct are measured on the same scale as those of obfuscating transformations.

Definition 6 (Opaque constructs). A variable V is opaque at a point p in a program if V has a property q at p that is known at obfuscation time. We write $V_p^q$. A predicate P is opaque at a point p in a program if its value (True or False) is known at obfuscation time. We write $P_p^T$ if P is always True at p, $P_p^F$ if P is always False at p and $P_p^?$ if P is sometimes True and sometimes False at p.

An opaque construct is said to be trivial if a deobfuscator can deduce its value by static local analysis, and weak if a static global analysis is required to deduce its value.

Inserting dead code
Opaque predicates enable the insertion of irrelevant code. A block of instructions can be placed in the if branch of a condition on an opaque predicate $P^T$, and some dead code (i.e. code that has no actual effect but is still hard to understand) can be inserted in the else branch. Another use of dead code insertion relies on a $P^?$ opaque predicate, with two versions of the same code in the if and else branches. The version that is run is then determined at runtime, and the deobfuscator may take some time to understand that the two versions actually have the same effect. We could also use a $P^T$ opaque predicate with two versions of the same code, $S^a$ if true and $S^b$ if false, where the $S^b$ version has some bugs in it (see Figure 2).

Extending loop conditions
We can use opaque predicates to make loop termination conditions more complex without changing the number of iterations. For example, we could replace a condition C by C && $P^T$.

Converting a reducible flow graph to a non-reducible one
Using gotos combined with opaque predicates, we can insert jumps that are never taken. They make the flow graph non-reducible and force the deobfuscator to build an equivalent program whose flow graph it can reduce.
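Dead code insertion and loop condition extension can be illustrated together with a minimal sketch. It assumes a deliberately simple number-theoretic predicate (the product of two consecutive integers is even), which a deobfuscator could recognize with little effort; the function names `opaque_true`, `sum_below` and `sum_below_obfuscated` are hypothetical and chosen for this example only.

```python
def opaque_true(x: int) -> bool:
    """A P^T predicate: x * (x + 1) is a product of consecutive integers, hence even."""
    return (x * (x + 1)) % 2 == 0


def sum_below(n: int) -> int:
    """Original program: sum of the integers 0 .. n-1."""
    total, i = 0, 0
    while i < n:
        total += i
        i += 1
    return total


def sum_below_obfuscated(n: int) -> int:
    """Same observable behaviour as sum_below, with opaque predicates added."""
    total, i = 0, 0
    # Extended loop condition: C becomes C && P^T, same number of iterations.
    while i < n and opaque_true(total + i):
        if opaque_true(i):
            total += i             # real code, always executed
        else:
            total = total * 3 - i  # dead code, never executed
        i += 1
    return total


print(sum_below(10), sum_below_obfuscated(10))
```

Because the predicate depends only on its argument and a well-known arithmetic fact, it would count as trivial or weak on the resilience scale; it only serves to show where opaque predicates are placed.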
Since gotos introduce breaks in the control flow of a program, deobfuscators cannot easily reduce control flows that use many of them.

Adding redundant operands
Provided result accuracy is not of high importance, we can use opaque variables to add redundant operands, increasing the potency of the program.

[Figure 2: Dead code insertion and opaque predicates]

    (with opaque variables R = 1, P = 2Q, Q = P/2)

    Original:
        x=x+y;
        z=w+1;

    Obfuscated:
        x=x+y*R;
        z=w+(P/Q)/2;

Parallelizing code
Parallel programs are not as easy to understand as sequential ones. Since plenty of tools for parallelism are available nowadays, we can use them to obscure the control flow of the program. We could create useless threads that appear to do real work, or we could run several independent tasks of the control flow at the same time. Of course, if the computer that runs the program cannot run more than one process at a time, these transformations will slow the program down. But our goal here is to obfuscate the program; any actual acceleration would be a side effect.

When coding, a programmer groups pieces of code that have something in common and writes functions, making the code more understandable and easier to maintain. The next transformations are aggregation transformations that inline or outline code, unroll loops or interleave functions.

Inlining or outlining functions
Inlining a function means replacing calls to the function by the function code. Inlining has one-way resilience: it removes every abstraction introduced by the function. Outlining instructions means creating a function that runs these instructions. One use of outlining for obfuscation is to outline parts of semantically different procedures into the same function, as in the example below.

    Original:
        l1; l2; ...; lk;
        m1; m2; ...; mk;

    Obfuscated:
        <typea> foo(<args>) {
            lk-1; lk;
            m1; m2;
        }
        l1; ...; lk-2;
        foo(<args>);
        m3; ...; mk;

Interleaving functions
Interleaving functions means merging two (or more) functions into one, merging their bodies, arguments and returned results. The resulting function takes an additional argument that tells which of the initial functions has to be run.
Detecting function interleaving is really difficult for reverse engineers, since it scrambles the semantics of the functions that were interleaved.

Cloning functions
For a given function, one writes several functions that have exactly the same role and obfuscates each one in a different way. Then, each time the function is to be called, one of its clones is called instead. Since the context of a function call is used to understand the purpose of the function, and since the bodies of the clones are obfuscated, this transformation makes the function harder to understand.
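Function interleaving can be sketched on two unrelated toy functions. This is a minimal illustration, assuming a selector passed as an extra first argument; `area`, `greeting` and `interleaved` are hypothetical names introduced for this example.

```python
def area(width, height):
    """First original function: area of a rectangle."""
    return width * height


def greeting(name):
    """Second original function: build a greeting string."""
    return "hello " + name


def interleaved(selector, a, b=None):
    """Merged version of area and greeting.

    Statements of both bodies alternate; `selector` decides which
    of them take effect (0 for area, 1 for greeting).
    """
    acc = a
    if selector == 0:
        acc = acc * b       # from area
    pad = "hello "          # from greeting, harmless when selector == 0
    if selector == 1:
        acc = pad + acc     # from greeting
    return acc


print(interleaved(0, 3, 4), interleaved(1, "bob"))
```

Each call site of the original functions is rewritten to call the merged function with the matching selector, so the merged signature and body no longer reveal which computation a given call performs.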

Loop transformations
Three loop transformations can be enumerated: loop blocking, loop unrolling and loop fission. An example of each is given in Figure 3.
- Loop blocking means partitioning the iteration space of a loop into smaller loops. This transformation is usually used to make sure the data used in the loop stays in the CPU cache.
- Loop unrolling means replicating the body of a loop several times in order to reduce its number of iterations. This transformation is often used as a preliminary step to the parallelization of the loop.
- Loop fission expands a loop with a compound body into several loops with the same iteration space.

Taken independently, these transformations have fairly good potency but very low resilience, since in most cases static analysis can counter them. But when these transformations are used together, the resilience of the program increases sharply.

Programmers prefer to increase the locality of their code, which makes it more understandable. When obfuscating a program, we want to mix pieces of the code (e.g. declarations of functions or of variables). Such transformations have low potency, since they do not obscure the code that much. However, their resilience is one-way in most cases: once the transformation is applied, no information about the original order of the mixed pieces remains.

When applying aggregation transformations, one should pay attention to the order in which they are applied. For example, inlining several functions and then outlining a block of the resulting code (making sure the outlined block includes instructions from the inlined functions) is more effective than outlining a block of code with a single monolithic meaning.

Building high resilience opaque constructs
Predicates such as … have trivial or weak resilience.
Since the resilience of an opaque construct influences the quality of the transformation that uses it, one would like to have high resilience opaque constructs. There are several methods for building resilient and cheap opaque constructs [6]. A first method is to use aliasing. Deducing properties from pointers is difficult, since they refer to different memory locations during the execution of the program. An example of an opaque predicate based on aliasing is *i==*j, which holds whenever the pointers i and j refer to the same memory location. Another method is to take advantage of the parallel processing of variables. A variable (or a pointer) modified by many threads makes a highly resilient opaque variable, as it is very difficult and time consuming to analyse statically. For example, there are n! ways to execute n parallel instructions.

3.4 Deobfuscation

A deobfuscator takes a program and simplifies it, removing useless control and data flow. Its three main actions are:
- eliminating dead code: determining whether a block of code will be reached or not,
- eliminating irrelevant variables: determining whether the value of a variable is still relevant further in the code from a given point,
- removing aliasing.

Although code obfuscation seems ineffective in theory, there is currently no easy way to deobfuscate a program in practice. Appel [2] tackled the matter of white box obfuscation of a program. This means that the obfuscating program F is perfectly known to the public, but it uses a key K, kept secret, to obfuscate P; thus we have P' = F(P, K). Knowing F, the task of the deobfuscator is NP-easy. The deobfuscator would run the following steps:
- Guess a source program S
- Guess a key K
- Compute F(S, K)
- Check that the result equals P'

Therefore white box deobfuscation is NP-easy, but this does not lead to really useful deobfuscation programs. As we saw previously, some transformations are one-way, and many deobfuscators do not take the risk of inverting such transformations.
And when they do, they often fail. Barak et al. tackled the matter of black box obfuscation [3]. The virtual black box property stipulates that anything that can be efficiently computed from O(P) can be efficiently computed given oracle access to P. By building a special set of unobfuscatable functions, Barak et al. proved that black box obfuscation is not possible, even when using approximate obfuscators (meaning that P' only has a certain probability of giving the same result as P).

    Loop blocking
    Original:
        for (i=1; i<=n; i++)
            for (j=1; j<=n; j++)
                a[i,j]=b[j,i];
    Transformed:
        for (I=1; I<=n; I+=64)
            for (J=1; J<=n; J+=64)
                for (i=I; i<=min(I+63,n); i++)
                    for (j=J; j<=min(J+63,n); j++)
                        a[i,j]=b[j,i];

    Loop unrolling
    Original:
        for (i=2; i<(n-1); i++)
            a[i] += a[i-1]*a[i+1];
    Transformed:
        for (i=2; i<(n-2); i+=2) {
            a[i] += a[i-1]*a[i+1];
            a[i+1] += a[i]*a[i+2];
        }
        /* one iteration is left when the original iteration count n-3 is odd */
        if (((n-3) % 2) == 1)
            a[n-2] += a[n-3]*a[n-1];

    Loop fission
    Original:
        for (i=1; i<n; i++) {
            a[i] += c;
            x[i+1]=d+x[i+1]*a[i];
        }
    Transformed:
        for (i=1; i<n; i++)
            a[i] += c;
        for (i=1; i<n; i++)
            x[i+1]=d+x[i+1]*a[i];

Figure 3: Loop transformations (from top to bottom): loop blocking, loop unrolling and loop fission

A major function of a deobfuscator is to eliminate the bogus code that was inserted using opaque predicates. It is easier for a deobfuscator to identify and evaluate local opaque constructs than global ones. Collberg et al. [5] listed several techniques to boost the resilience of a program against automatic deobfuscators. If a transformation can be easily reversed by a deobfuscator, we can introduce bogus code based on that transformation, making the reversal less obvious. Since some deobfuscators use pattern matching to identify opaque predicates, opaque constructs can be written with the same syntax as the real code. Several deobfuscators use program slicing [14] to reduce the deobfuscation problem to several smaller problems. One can exploit the weaknesses of slicing techniques by introducing aliasing or adding useless variable dependencies, which makes it harder to identify sliceable blocks. When using static analysis, a deobfuscator can assume a construct to be opaque.
But to prove it, the reverse engineer will have to make a mutant version 1 of the program where the assumed opaque construct is set to its assumed value. If and 1 give the same outputs for the same inputs then the assumption was right. Since choosing the correct input values set will be long and difficult (all the paths in the program have to be covered), we would prefer to use? predicate or use interleaved predicate that would have to be solved together at the same time. One could also make data flow analysis harder or

force the deobfuscator to prove complex theorems in order to crack an opaque predicate.

In theory, code obfuscation is impossible, as was proved, since deobfuscation is always possible. In practice, exploiting the flaws of automatic deobfuscators can make an obfuscated program hard enough to crack that a new version of the program can be written and obfuscated in the meantime.

4 Summary of the obfuscation metrics

The obfuscation transformations listed previously have been classified by quality by Collberg et al. [5]. Table 1 summarizes their work. Transformations are classified by target and operation. The quality of each transformation is given, and the metrics used to measure its potency are listed using the notation introduced earlier in this article.

As seen previously, layout transformations offer high resilience at no cost, whereas their potency varies. Loop transformations have a very low cost but also a low potency and resilience. Parallelizing code offers high potency and strong resilience but is costly. In several cases, the quality of a transformation depends on the quality of the opaque constructs used or on the complexity of a data structure or function.

5 Planned use of EA for obfuscation

As in a compiler, choosing which transformations to apply and in which order is a complex task that can take a long time. In common compilers, the transformations and their order are fixed so as to give good performance on average. Guelton and Varrette [10] combined the source-to-source compiler PIPS with evolutionary algorithms to seek the best combination of optimization transformations for a given program. Since PIPS uses configuration files to specify the nature, order and location of the transformations it applies, the EA can manipulate text files instead of binaries. Each individual is represented by a set of configuration files that leads to a binary, which is compared with the binaries of the other individuals.
Comparing the evolutionary approach with an exhaustive approach and a greedy approach, they found that the evolutionary approach gave an optimal result, like the exhaustive approach, in a time comparable to that of the greedy approach, which was unable to give an optimal result.

Since code obfuscation is a matter of transformations applied in a particular order, we can consider using an evolutionary algorithm to find the optimal sequence of transformations to apply to a program. Moreover, a distributed Evolutionary Algorithm (dEA) could be used to deal with obfuscation problems that would take particularly long to solve.

6 Conclusion

Software obfuscation has many applications: not only protecting a program's secrets (a powerful algorithm, an extremely efficient data structure) but also birthmarking, watermarking and even tamperproofing. Like software compilation, program obfuscation is a matter of transformations that have to be applied in the right order to provide an optimal result, since the potency and resilience of a transformation depend on the other transformations it is combined with. Trying every combination guarantees that an optimal configuration is found, but can take a very long time. That is why evolutionary algorithms and dEAs are promising for this problem.

Since software complexity does not have one single metric, evaluating the quality of an obfuscation transformation is not simple. Some transformations can be more or less effective on different programs and in different programming languages. Moreover, some transformations may not be available (or may just be useless) in some languages.

Although it has been proved that, in theory, program obfuscation is impossible and inefficient, whether a black box or a white box obfuscator is used, today's deobfuscators have many flaws that can be exploited, and deobfuscating a program still takes long enough to slow down reverse engineers. Hence, it is possible to protect a program's secrets by obfuscating it while a new version is being developed, so that cracking the previous version is pointless by the time the new one is available.
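
The search for a good sequence of transformations discussed in Sections 5 and 6 can be sketched as a small genetic algorithm. The Python sketch below is only an illustration under stated assumptions: the transformation names are taken from Table 1, but the numeric scores in TRANSFORMS, the order-dependent bonus in fitness and all function names are hypothetical placeholders. In a real setting, the fitness would be obtained by actually applying the sequence to the program and measuring the complexity metrics of the result.

```python
import random

# Hypothetical (potency gain, cost) scores per transformation. The names come
# from Table 1; the numbers are illustrative placeholders, not measurements.
TRANSFORMS = {
    "scramble_identifiers": (2.0, 0.0),
    "insert_dead_code": (3.0, 1.0),
    "outline_statements": (2.0, 0.5),
    "unroll_loop": (1.0, 0.5),
    "loop_fission": (1.0, 0.25),
    "parallelize_code": (4.0, 3.0),
}
NAMES = sorted(TRANSFORMS)
SEQ_LEN = 5  # number of transformations applied, in order


def fitness(seq):
    """Toy stand-in for obfuscation quality: summed potency minus cost.

    Order matters: a transformation that is applied again yields half its
    potency, and dead code inserted right after unrolling gets an assumed bonus.
    """
    score, seen, prev = 0.0, set(), None
    for name in seq:
        potency, cost = TRANSFORMS[name]
        if name in seen:
            potency *= 0.5
        if prev == "unroll_loop" and name == "insert_dead_code":
            potency += 1.0
        score += potency - cost
        seen.add(name)
        prev = name
    return score


def tournament(pop, rng, k=3):
    """Pick the fittest of k randomly drawn individuals."""
    return max(rng.sample(pop, k), key=fitness)


def crossover(a, b, rng):
    """One-point crossover of two transformation sequences."""
    cut = rng.randrange(1, SEQ_LEN)
    return a[:cut] + b[cut:]


def mutate(seq, rng, rate=0.2):
    """Replace each position by a random transformation with probability rate."""
    return [rng.choice(NAMES) if rng.random() < rate else t for t in seq]


def evolve(pop_size=20, generations=30, seed=0):
    """Return the best sequence found and the best fitness per generation."""
    rng = random.Random(seed)
    pop = [[rng.choice(NAMES) for _ in range(SEQ_LEN)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        elite = max(pop, key=fitness)
        history.append(fitness(elite))
        children = [elite]  # elitism: the best individual always survives
        while len(children) < pop_size:
            child = crossover(tournament(pop, rng), tournament(pop, rng), rng)
            children.append(mutate(child, rng))
        pop = children
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history
```

Calling evolve() returns the best sequence found together with the best fitness of each generation. Because the best individual is always carried over unchanged, this history never decreases. A distributed variant would evaluate the individuals of a generation in parallel, which is where most of the time is spent once the fitness requires transforming and measuring a real program.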

Target   Operation           Transformation                     Potency  Resilience  Cost    Metrics
-------  ------------------  ---------------------------------  -------  ----------  ------  --------------
Layout                       Scramble Identifiers               medium   one-way     free
                             Change Formatting                  low      one-way     free
                             Remove Comments                    high     one-way     free
Control  Computations        Insert Dead or Irrelevant Code     (a)      (a)         (a)     µ1, µ2, µ3
                             Extend Loop Condition              (a)      (a)         (a)     µ1, µ2, µ3
                             Reducible to non-Reducible         (a)      (a)         (a)     µ1, µ2, µ3
                             Add Redundant Operands             (a)      (a)         (a)     µ1, µ2, µ3
                             Parallelize Code                   high     strong      costly  µ1, µ2
         Aggregation         Inline Method                      medium   one-way     free    µ1
                             Outline Statements                 medium   strong      free    µ1
                             Interleave Functions               (b)      (b)         (b)     µ1, µ2, µ5
                             Clone Functions                    (b)      (b)         (b)     µ1
                             Block loop                         low      weak        free    µ1, µ2
                             Unroll loop                        low      weak        cheap   µ1
                             Loop fission                       low      weak        free    µ1, µ2
         Ordering            Reorder Statements                 low      one-way     free
                             Reorder Loops                      low      one-way     free
                             Reorder Expression                 low      one-way     free
Data     Storage & Encoding  Change Encoding                    (c)      (c)         (c)
                             Promote Scalar to Object           low      strong      free    µ1
                             Change Variable Lifetime           low      strong      free    µ4
                             Split Variable                     (d)      (d)         (d)
                             Convert Static to Procedural Data  (e)      (e)         (e)     µ1, µ2
         Aggregation         Merge Scalar Variables             low      weak        free    µ1
                             Split Array                        *        weak        free    µ1, µ2, µ6
                             Merge Arrays                       *        weak        free    µ1, µ2
                             Fold Array                         *        weak        cheap   µ1, µ2, µ6, µ3
                             Flatten Array                      *        weak        free
         Ordering            Reorder Functions & Variables      low      one-way     free
                             Reorder Arrays                     low      weak        free    µ1

(a) Depends on the quality of the opaque construct and on the nesting depth of its insertion.
(b) Depends on the quality of the opaque predicate.
(c) Depends on the complexity of the encoding function.
(d) Depends on the number of variables into which the original variable is split.
(e) Depends on the complexity of the generated function.

Table 1: Obfuscation transformations and their qualities (potency, resilience, cost) with the metrics used (courtesy of Collberg et al.). A * in the quality columns indicates that the measure depends on circumstances discussed previously in this article.
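
The loop transformations rated in Table 1 (block loop, unroll loop, loop fission) only qualify as obfuscations if the transformed loop computes the same result as the original one. The following Python sketch, whose function names are hypothetical and which is not taken from the cited works, re-implements the three loop pairs of Figure 3 and compares their results on random data. It assumes that the fission example updates x[i+1], and that the unrolled loop handles its leftover iteration i = n-2 whenever the iteration count n-3 is odd. Arrays are indexed from 1 as in the figure, so index 0 is simply unused.

```python
import random


def block_plain(b, n):
    """Original transposition loop."""
    a = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            a[i][j] = b[j][i]
    return a


def block_tiled(b, n, block=64):
    """Loop blocking: the same assignments, visited tile by tile."""
    a = [[0] * (n + 1) for _ in range(n + 1)]
    for I in range(1, n + 1, block):
        for J in range(1, n + 1, block):
            for i in range(I, min(I + block - 1, n) + 1):
                for j in range(J, min(J + block - 1, n) + 1):
                    a[i][j] = b[j][i]
    return a


def unroll_plain(a, n):
    """Original loop: i = 2 .. n-2."""
    a = list(a)
    for i in range(2, n - 1):
        a[i] += a[i - 1] * a[i + 1]
    return a


def unroll_by_two(a, n):
    """Loop unrolling by a factor of two, plus one leftover iteration."""
    a = list(a)
    for i in range(2, n - 2, 2):
        a[i] += a[i - 1] * a[i + 1]
        a[i + 1] += a[i] * a[i + 2]
    # n >= 4 guards against Python's non-negative result for (-1) % 2.
    if n >= 4 and (n - 3) % 2 == 1:
        a[n - 2] += a[n - 3] * a[n - 1]
    return a


def fission_plain(a, x, c, d, n):
    """Original loop with two statements in its body."""
    a, x = list(a), list(x)
    for i in range(1, n):
        a[i] += c
        x[i + 1] = d + x[i + 1] * a[i]
    return a, x


def fission_split(a, x, c, d, n):
    """Loop fission: one loop per statement."""
    a, x = list(a), list(x)
    for i in range(1, n):
        a[i] += c
    for i in range(1, n):
        x[i + 1] = d + x[i + 1] * a[i]
    return a, x


def equivalent(n, seed=0):
    """Compare each original loop with its transformed version on random data."""
    rng = random.Random(seed)
    b = [[rng.randint(-9, 9) for _ in range(n + 1)] for _ in range(n + 1)]
    a = [rng.randint(-3, 3) for _ in range(n + 1)]
    x = [rng.randint(-3, 3) for _ in range(n + 1)]
    return (block_plain(b, n) == block_tiled(b, n)
            and unroll_plain(a, n) == unroll_by_two(a, n)
            and fission_plain(a, x, 2, 5, n) == fission_split(a, x, 2, 5, n))
```

Such a check is a test on sample inputs, not a proof of equivalence. It is nevertheless a cheap safeguard when transformations are generated and combined automatically, since an incorrect remainder in an unrolled loop is easy to introduce and hard to spot by reading the obfuscated code.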

References

[1] Parallélisation interprocédurale de programmes scientifiques (PIPS).
[2] Andrew Appel. Deobfuscation is in NP.
[3] Boaz Barak, Oded Goldreich, Russell Impagliazzo, Steven Rudich, Amit Sahai, Salil Vadhan, and Ke Yang. On the (im)possibility of obfuscating programs.
[4] Shyam R. Chidamber and Chris F. Kemerer. A metrics suite for object oriented design.
[5] Christian Collberg, Clark Thomborson, and Douglas Low. A taxonomy of obfuscating transformations.
[6] Christian Collberg, Clark Thomborson, and Douglas Low. Manufacturing cheap, resilient, and stealthy opaque constructs.
[7] Christian Collberg and Jasvir Nagra. Surreptitious Software: Obfuscation, Watermarking, and Tamperproofing for Software Protection. Addison-Wesley Professional.
[8] C. Darwin. The Origin of Species. John Murray.
[9] François Irigoin, Pierre Jouvelot, and Rémi Triolet. Semantical interprocedural parallelization.
[10] Serge Guelton and Sébastien Varrette. Une approche génétique et source à source de l'optimisation de code.
[11] Thomas McCabe. A complexity measure.
[12] N. Melab, S. Cahon, and E. Talbi. Grid computing for parallel bioinspired algorithms. Journal of Parallel and Distributed Computing, 66(8), August.
[13] Leonardo Trujillo and Gustavo Olague. Automated design of image operators that detect interest points. Evolutionary Computation, 16(4), January.
[14] Mark Weiser. Program slicing.

A Acronyms used

dEA  distributed Evolutionary Algorithm
EA   Evolutionary Algorithm
EAs  Evolutionary Algorithms
GA   Genetic Algorithm
GP   Genetic Programming

B Notations

P = program
P' = obfuscated program
τ_pot(P) = potency of the transformation from P to P'
τ_res(P) = resilience of the transformation from P to P'
τ_cost(P) = cost of the transformation from P to P'
τ_qual(P) = quality of the transformation from P to P'
E(P) = the complexity of P
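
As an illustration of this notation, the sketch below computes a potency value for the loop unrolling example. It assumes the definition of potency given by Collberg et al. [5], τ_pot(P) = E(P')/E(P) - 1, and approximates E by a crude token count, in the spirit of a program length metric. The function names, the tokenizing regular expression and the two code strings are hypothetical choices made for this example only.

```python
import re


def complexity(source):
    """E(P), approximated by program length: the number of identifiers,
    integer literals and single punctuation characters in the source text."""
    return len(re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", source))


def potency(original, obfuscated):
    """tau_pot(P) = E(P') / E(P) - 1; positive when P' is more complex than P."""
    return complexity(obfuscated) / complexity(original) - 1


# P: a simple loop; P_PRIME: the same loop unrolled by a factor of two.
P = "for (i = 2; i < n - 1; i++) a[i] += a[i-1] * a[i+1];"
P_PRIME = (
    "for (i = 2; i < n - 2; i += 2) {"
    " a[i] += a[i-1] * a[i+1];"
    " a[i+1] += a[i] * a[i+2]; }"
    " if ((n - 3) % 2 == 1) a[n-2] += a[n-3] * a[n-1];"
)
```

Here potency(P, P_PRIME) is positive because the unrolled loop is longer, while potency(P, P) is zero. A single token count is only one possible choice for E: Table 1 lists several metrics per transformation, so a practical evaluation would combine more than one of them.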


More information

River Dell Regional School District. Computer Programming with Python Curriculum

River Dell Regional School District. Computer Programming with Python Curriculum River Dell Regional School District Computer Programming with Python Curriculum 2015 Mr. Patrick Fletcher Superintendent River Dell Regional Schools Ms. Lorraine Brooks Principal River Dell High School

More information

Obfuscation of Abstract Data-Types

Obfuscation of Abstract Data-Types Obfuscation of Abstract Data-Types Stephen Drape St John s College Thesis submitted for the degree of Doctor of Philosophy at the University of Oxford Trinity Term 2004 Obfuscation of Abstract Data Types

More information

Memory Systems. Static Random Access Memory (SRAM) Cell

Memory Systems. Static Random Access Memory (SRAM) Cell Memory Systems This chapter begins the discussion of memory systems from the implementation of a single bit. The architecture of memory chips is then constructed using arrays of bit implementations coupled

More information

CSC408H Lecture Notes

CSC408H Lecture Notes CSC408H Lecture Notes These lecture notes are provided for the personal use of students taking Software Engineering course in the Summer term 2005 at the University of Toronto. Copying for purposes other

More information

each college c i C has a capacity q i - the maximum number of students it will admit

each college c i C has a capacity q i - the maximum number of students it will admit n colleges in a set C, m applicants in a set A, where m is much larger than n. each college c i C has a capacity q i - the maximum number of students it will admit each college c i has a strict order i

More information

G563 Quantitative Paleontology. SQL databases. An introduction. Department of Geological Sciences Indiana University. (c) 2012, P.

G563 Quantitative Paleontology. SQL databases. An introduction. Department of Geological Sciences Indiana University. (c) 2012, P. SQL databases An introduction AMP: Apache, mysql, PHP This installations installs the Apache webserver, the PHP scripting language, and the mysql database on your computer: Apache: runs in the background

More information

Network (Tree) Topology Inference Based on Prüfer Sequence

Network (Tree) Topology Inference Based on Prüfer Sequence Network (Tree) Topology Inference Based on Prüfer Sequence C. Vanniarajan and Kamala Krithivasan Department of Computer Science and Engineering Indian Institute of Technology Madras Chennai 600036 vanniarajanc@hcl.in,

More information

CS/COE 1501 http://cs.pitt.edu/~bill/1501/

CS/COE 1501 http://cs.pitt.edu/~bill/1501/ CS/COE 1501 http://cs.pitt.edu/~bill/1501/ Lecture 01 Course Introduction Meta-notes These notes are intended for use by students in CS1501 at the University of Pittsburgh. They are provided free of charge

More information

Counter Expertise Review on the TNO Security Analysis of the Dutch OV-Chipkaart. OV-Chipkaart Security Issues Tutorial for Non-Expert Readers

Counter Expertise Review on the TNO Security Analysis of the Dutch OV-Chipkaart. OV-Chipkaart Security Issues Tutorial for Non-Expert Readers Counter Expertise Review on the TNO Security Analysis of the Dutch OV-Chipkaart OV-Chipkaart Security Issues Tutorial for Non-Expert Readers The current debate concerning the OV-Chipkaart security was

More information

THE CERN/SL XDATAVIEWER: AN INTERACTIVE GRAPHICAL TOOL FOR DATA VISUALIZATION AND EDITING

THE CERN/SL XDATAVIEWER: AN INTERACTIVE GRAPHICAL TOOL FOR DATA VISUALIZATION AND EDITING THE CERN/SL XDATAVIEWER: AN INTERACTIVE GRAPHICAL TOOL FOR DATA VISUALIZATION AND EDITING Abstract G. Morpurgo, CERN As a result of many years of successive refinements, the CERN/SL Xdataviewer tool has

More information

Chapter 6: Programming Languages

Chapter 6: Programming Languages Chapter 6: Programming Languages Computer Science: An Overview Eleventh Edition by J. Glenn Brookshear Copyright 2012 Pearson Education, Inc. Chapter 6: Programming Languages 6.1 Historical Perspective

More information

Introduction to Compiler Consultant

Introduction to Compiler Consultant Application Report SPRAA14 April 2004 Introduction to Compiler Consultant George Mock Software Development Systems ABSTRACT C and C++ are very powerful and expressive programming languages. Even so, these

More information

A Brief Study of the Nurse Scheduling Problem (NSP)

A Brief Study of the Nurse Scheduling Problem (NSP) A Brief Study of the Nurse Scheduling Problem (NSP) Lizzy Augustine, Morgan Faer, Andreas Kavountzis, Reema Patel Submitted Tuesday December 15, 2009 0. Introduction and Background Our interest in the

More information

Curriculum Map. Discipline: Computer Science Course: C++

Curriculum Map. Discipline: Computer Science Course: C++ Curriculum Map Discipline: Computer Science Course: C++ August/September: How can computer programs make problem solving easier and more efficient? In what order does a computer execute the lines of code

More information

The following themes form the major topics of this chapter: The terms and concepts related to trees (Section 5.2).

The following themes form the major topics of this chapter: The terms and concepts related to trees (Section 5.2). CHAPTER 5 The Tree Data Model There are many situations in which information has a hierarchical or nested structure like that found in family trees or organization charts. The abstraction that models hierarchical

More information

Genetic Algorithms commonly used selection, replacement, and variation operators Fernando Lobo University of Algarve

Genetic Algorithms commonly used selection, replacement, and variation operators Fernando Lobo University of Algarve Genetic Algorithms commonly used selection, replacement, and variation operators Fernando Lobo University of Algarve Outline Selection methods Replacement methods Variation operators Selection Methods

More information

CSE373: Data Structures and Algorithms Lecture 3: Math Review; Algorithm Analysis. Linda Shapiro Winter 2015

CSE373: Data Structures and Algorithms Lecture 3: Math Review; Algorithm Analysis. Linda Shapiro Winter 2015 CSE373: Data Structures and Algorithms Lecture 3: Math Review; Algorithm Analysis Linda Shapiro Today Registration should be done. Homework 1 due 11:59 pm next Wednesday, January 14 Review math essential

More information

Chapter 1. Dr. Chris Irwin Davis Email: cid021000@utdallas.edu Phone: (972) 883-3574 Office: ECSS 4.705. CS-4337 Organization of Programming Languages

Chapter 1. Dr. Chris Irwin Davis Email: cid021000@utdallas.edu Phone: (972) 883-3574 Office: ECSS 4.705. CS-4337 Organization of Programming Languages Chapter 1 CS-4337 Organization of Programming Languages Dr. Chris Irwin Davis Email: cid021000@utdallas.edu Phone: (972) 883-3574 Office: ECSS 4.705 Chapter 1 Topics Reasons for Studying Concepts of Programming

More information

Cryptography and Network Security Prof. D. Mukhopadhyay Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

Cryptography and Network Security Prof. D. Mukhopadhyay Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Cryptography and Network Security Prof. D. Mukhopadhyay Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No. # 11 Block Cipher Standards (DES) (Refer Slide

More information

Comprehensive Static Analysis Using Polyspace Products. A Solution to Today s Embedded Software Verification Challenges WHITE PAPER

Comprehensive Static Analysis Using Polyspace Products. A Solution to Today s Embedded Software Verification Challenges WHITE PAPER Comprehensive Static Analysis Using Polyspace Products A Solution to Today s Embedded Software Verification Challenges WHITE PAPER Introduction Verification of embedded software is a difficult task, made

More information

Compiling Object Oriented Languages. What is an Object-Oriented Programming Language? Implementation: Dynamic Binding

Compiling Object Oriented Languages. What is an Object-Oriented Programming Language? Implementation: Dynamic Binding Compiling Object Oriented Languages What is an Object-Oriented Programming Language? Last time Dynamic compilation Today Introduction to compiling object oriented languages What are the issues? Objects

More information

Problem Solving Basics and Computer Programming

Problem Solving Basics and Computer Programming Problem Solving Basics and Computer Programming A programming language independent companion to Roberge/Bauer/Smith, "Engaged Learning for Programming in C++: A Laboratory Course", Jones and Bartlett Publishers,

More information

Computer Programming I

Computer Programming I Computer Programming I COP 2210 Syllabus Spring Semester 2012 Instructor: Greg Shaw Office: ECS 313 (Engineering and Computer Science Bldg) Office Hours: Tuesday: 2:50 4:50, 7:45 8:30 Thursday: 2:50 4:50,

More information

Programming Languages & Tools

Programming Languages & Tools 4 Programming Languages & Tools Almost any programming language one is familiar with can be used for computational work (despite the fact that some people believe strongly that their own favorite programming

More information

Static IP Routing and Aggregation Exercises

Static IP Routing and Aggregation Exercises Politecnico di Torino Static IP Routing and Aggregation xercises Fulvio Risso August 0, 0 Contents I. Methodology 4. Static routing and routes aggregation 5.. Main concepts........................................

More information

Scaling up = getting a better machine. Scaling out = use another server and add it to your cluster.

Scaling up = getting a better machine. Scaling out = use another server and add it to your cluster. MongoDB 1. Introduction MongoDB is a document-oriented database, not a relation one. It replaces the concept of a row with a document. This makes it possible to represent complex hierarchical relationships

More information

Lumousoft Visual Programming Language and its IDE

Lumousoft Visual Programming Language and its IDE Lumousoft Visual Programming Language and its IDE Xianliang Lu Lumousoft Inc. Waterloo Ontario Canada Abstract - This paper presents a new high-level graphical programming language and its IDE (Integration

More information

COLLEGE ALGEBRA. Paul Dawkins

COLLEGE ALGEBRA. Paul Dawkins COLLEGE ALGEBRA Paul Dawkins Table of Contents Preface... iii Outline... iv Preliminaries... Introduction... Integer Exponents... Rational Exponents... 9 Real Exponents...5 Radicals...6 Polynomials...5

More information