Design of Java Obfuscator MANGINS++ - A novel tool to secure code



Similar documents
Introduction to Program Obfuscation

Code Obfuscation. Mayur Kamat Nishant Kumar

Guaranteed Slowdown, Generalized Encryption Scheme, and Function Sharing

How To Ensure Correctness Of Data In The Cloud

Applications of obfuscation to software and hardware systems

Secret Communication through Web Pages Using Special Space Codes in HTML Files

1. Define: (a) Variable, (b) Constant, (c) Type, (d) Enumerated Type, (e) Identifier.

Keywords: Regression testing, database applications, and impact analysis. Abstract. 1 Introduction

Component visualization methods for large legacy software in C/C++

DIGITAL RIGHTS MANAGEMENT SYSTEM FOR MULTIMEDIA FILES

Database Programming with PL/SQL: Learning Objectives

CHAPTER 5 INTELLIGENT TECHNIQUES TO PREVENT SQL INJECTION ATTACKS

Project Proposal. Data Storage / Retrieval with Access Control, Security and Pre-Fetching

Genetic Algorithms and Sudoku

How To Use Pretty Good Privacy (Pgp) For A Secure Communication

2695 P a g e. IV Semester M.Tech (DCN) SJCIT Chickballapur Karnataka India

Data Refinery with Big Data Aspects

Subject : April Referral G 03 / 08 Case G03/08 Referral under Article 112(1)b) EPC ("Patentability of programs for computers")

Mobile Cloud Computing In Business

Software Protection through Code Obfuscation

Monalisa P. Kini, Kavita V. Sonawane, Shamsuddin S. Khan

Bisecting K-Means for Clustering Web Log data

A Tokenization and Encryption based Multi-Layer Architecture to Detect and Prevent SQL Injection Attack

A UPS Framework for Providing Privacy Protection in Personalized Web Search

Appendix B Data Quality Dimensions

ecommerce Web-Site Trust Assessment Framework Based on Web Mining Approach

An in-building multi-server cloud system based on shortest Path algorithm depending on the distance and measured Signal strength

Software License Management using the Polymorphic Encryption Algorithm White Paper

MULTIFACTOR AUTHENTICATION FOR SOFTWARE PROTECTION

Lecture 12: Software protection techniques. Software piracy protection Protection against reverse engineering of software

Monitoring Data Integrity while using TPA in Cloud Environment

SECURING ENTERPRISE NETWORK 3 LAYER APPROACH FOR BYOD

TEACHER S GUIDE TO RUSH HOUR

Index Terms: Cloud Computing, Third Party Auditor, Threats In Cloud Computing, Dynamic Encryption.

An Introduction to. Metrics. used during. Software Development

Software Reverse Engineering

Comparing Methods to Identify Defect Reports in a Change Management Database

TCM: Transactional Completeness Measure based vulnerability analysis for Business Intelligence Support

HYPERION SMART VIEW FOR OFFICE RELEASE NEW FEATURES CONTENTS IN BRIEF. General... 2 Essbase... 3 Planning... 4 Reporting and Analysis...

On the features and challenges of security and privacy in distributed internet of things. C. Anurag Varma CpE /24/2016

Dynamic Query Updation for User Authentication in cloud Environment

Tufts University. Department of Computer Science. COMP 116 Introduction to Computer Security Fall 2014 Final Project. Guocui Gao

ISSN: A Review: Image Retrieval Using Web Multimedia Mining

Security in Android apps

Efficient Data Structures for Decision Diagrams

Sampling and Sampling Distributions

Remote Usability Evaluation of Mobile Web Applications

SOFTWARE REQUIREMENTS

Chapter 23. Database Security. Security Issues. Database Security

Cloud SQL Security. Swati Srivastava 1 and Meenu 2. Engineering College., Gorakhpur, U.P. Gorakhpur, U.P. Abstract

Adi Armoni Tel-Aviv University, Israel. Abstract

UTILIZING COMPOUND TERM PROCESSING TO ADDRESS RECORDS MANAGEMENT CHALLENGES

Privacy preserving technique to secure cloud

CATIA V5 Surface Design

Index Terms Cloud Storage Services, data integrity, dependable distributed storage, data dynamics, Cloud Computing.

How To Protect Your Network From Intrusions From A Malicious Computer (Malware) With A Microsoft Network Security Platform)

Impelling Heart Attack Prediction System using Data Mining and Artificial Neural Network

Enhancing Cloud Security By: Gotcha (Generating Panoptic Turing Tests to Tell Computers and Human Aparts)

Interactive Information Visualization of Trend Information

A Knowledge Base Representing Porter's Five Forces Model

Implementing a Security Management System: An Outline

PSG College of Technology, Coimbatore Department of Computer & Information Sciences BSc (CT) G1 & G2 Sixth Semester PROJECT DETAILS.

WHITE PAPER SPLUNK SOFTWARE AS A SIEM

International Journal of Advanced Computer Technology (IJACT) ISSN: PRIVACY PRESERVING DATA MINING IN HEALTH CARE APPLICATIONS

MULTI-DIMENSIONAL PASSWORD GENERATION TECHNIQUE FOR ACCESSING CLOUD SERVICES

k-gram Based Software Birthmarks

Chapter 6: Fundamental Cloud Security

Using Real Time Computer Vision Algorithms in Automatic Attendance Management Systems

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

A Secure Decentralized Access Control Scheme for Data stored in Clouds

Procedure for Marine Traffic Simulation with AIS Data

Author Gender Identification of English Novels

ISSN: (Online) Volume 2, Issue 1, January 2014 International Journal of Advance Research in Computer Science and Management Studies

Voluntary Product Accessibility Template Blackboard Learn Release 9.1 April 2014 (Published April 30, 2014)

4 The M/M/1 queue. 4.1 Time-dependent behaviour

ASSOCIATION RULE MINING ON WEB LOGS FOR EXTRACTING INTERESTING PATTERNS THROUGH WEKA TOOL

Novel Data Extraction Language for Structured Log Analysis

Associate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2

What is Records Management?

Paillier Threshold Encryption Toolbox

The Open University s repository of research publications and other research outputs

Using Feedback Tags and Sentiment Analysis to Generate Sharable Learning Resources

Oracle Sales Cloud Reporting and Analytics Overview. Release 13.2 Part Number E January 2014

Computer Networks. Network Security and Ethics. Week 14. College of Information Science and Engineering Ritsumeikan University

AN INTELLIGENT TUTORING SYSTEM FOR LEARNING DESIGN PATTERNS

DBaaS Using HL7 Based on XMDR-DAI for Medical Information Sharing in Cloud

New Features in Primavera P6 EPPM 16.1

Impact of Service Oriented Architecture on ERP Implementations in Technical Education

Bluetooth Messenger: an Android Messenger app based on Bluetooth Connectivity

Enterprise Content Management. A White Paper. SoluSoft, Inc.

Design and Experiments of small DDoS Defense System using Traffic Deflecting in Autonomous System

The Travel and Expense Management Guide for 2014

Application of Data Mining based Malicious Code Detection Techniques for Detecting new Spyware

Financial Trading System using Combination of Textual and Numerical Data

Digital Evidence Search Kit

Chapter 6 Experiment Process

Transcription:

J. Comp. & Math. Sci. Vol. 1 (6), 646-652 (2010) Design of Java Obfuscator MANGINS++ - A novel tool to secure code HARSHA VARDHAN RAJENDRAN, CH KALYAN CHANDRA and R. SENTHIL KUMAR School of Computing Sciences and Engineering, VIT University, India. ABSTRACT Security is a vital part of any software. Software piracy has been growing at an exponential rate and had contributed to over $11 billion loss to the software industry during 2006. Although Software Security can be enforced legally and through techniques like Code Authentication, Server Side Execution, Program Encoding, etc. these techniques have their own limitations. Code Obfuscation is a relatively new field of computing science which has the ability to practically secure the code. In this paper we explain the reasons and different methods of securing code using Code Obfuscation, and then we put forward a tool called MANGINS++ which is designed by us. MANGINS++ obfuscates any given Java program using three different techniques. Next, we discuss the credibility of the tool using few novel evaluation metrics. And finally, we show that Code Obfuscation could be a successful technique in protecting software and could even be extended to many fields like SOA, E-governance, cloud computing, etc. 1. INTRODUCTION Software Security is the need of the hour in the present day world, where over 4 out of every 10 [3] software are being pirated. Especially for the code written in languages like Java, the program can easily be reverse engineered back to identify the principle algorithms and data structures used. These reasons pose a great threat for the code security. We do have many mechanisms to protect this code and other Intellectual property against these malicious attacks. But still these techniques are not sufficient and the question of an omnipotent technique is still persisting. There are two approaches to protect one s software. The first one is to impose legal protection like getting Copyrights 3 on the software and signing legal contracts on producing the duplicates of the project etc. These techniques are not applicable to all the products and the process is quite cumbersome as well as time taking. An amateur developer could not use this technique to protect his code. The second approach is to use the techniques like Code authentication, Server Side Execution and Program encoding, etc, as these are the most widely used techniques in the present world. Code authentication, although preserves the Intellectual property of the developer by securing code, it imposes quite a few of rules and regulations on the usage of the program. These undesirable restrictions decrease the usability feature of the software. Server side Execution requires the user to be connected to the central server for using the software. The software is not given to the user, but he uses the software just as any other service on the internet. Hence the applicability of this technique for all software is low. Program encoding posses the inherent burden of carrying out a tough mechanism which is both quite time taking and also

647 Harsha Vardhan Rajendran et al., J. Comp. & Math. Sci. Vol. 1 (6), 646-652 (2010) requires additional software to encode and decode the program at both the ends. This technique will not preserve the semantic meaning of the code. Hence as all the above techniques have their own drawbacks in securing the code, and so we propose a robust, efficient and dependable technique called as Code Obfuscation. Obfuscation is a technique through which one can achieve practical security (if cost of attacks on the software is higher than software itself). Obfuscation is performed to change the visual appearance of code without affecting its functional output. This change in visual appearance will drastically affect chances of code getting reverse engineered. So to implement the various techniques of obfuscation we have designed a tool called as MANGINS++ (as it performs name mangling, insertion of dead codes and other novel operations). 2. DESIGN OF MANGINS++ The following pages give an overview of designing an obfuscator and also explain the operations of MANGINS++, indigenous obfuscator designed by us. MANGINS++ obfuscates Java code by applying all the three available transformations of obfuscation. They are Lexical transformation, Control transformation and Data transformation 2. One method has been implemented in each transformation; Name mangling in Lexical transformation, Insertion of dead code in Control transformation and Re-arrangement of arrays in Data transformation. MANGINS++ is a complete obfuscator as it demonstrates each of the three basic obfuscation transformations. In a standard code, the name of any variable gives away the kind of information it stores. For example in a student class the common variables include student_name, registration_number, etc. The information contained in these variables is easily understood. So, any reverse engineer would first target variables and decipher their content based upon their names. So, name mangling must be done so as to render these names meaningless such that these will not provide any hint to reverse engineer. MANGINS++ performs name mangling operation to implement above said solution. It mangles the variable names along with all instances of their occurrences throughout the program. The mangling criterion we have selected presently is reversing the name. But more complex ones like using a random function to generate a random name can also be used. Another method used by reverse engineers to understand the code includes creation of control flow graphs for the program. These flow graphs depict the flow of the program thus making their job a lot easier. So, to prevent this from happening, the reverse engineers must be made to beat around the bush rather than reaching the crux of the program. Insertion of dead codes does that. It perturbs the flow graph of the program resulting in a more complex flow graph, which even though produces the same result is a lot more difficult to decipher. MANGINS++ selects a random dead code from a pool and inserts it at strategic points in the code so as to obfuscate it. One other transformation implemented by MANGINS++ is the re-arrangement of arrays. This has two major benefits. First, it renders the code more complex, i.e. it increases the visual complexity of the code thus making it more difficult to understand. Second, even though a reverse engineer does find out the contents of an array, it will be difficult to extract information as the array would be sufficiently long with data stored at

Harsha Vardhan Rajendran et al., J. Comp. & Math. Sci. Vol. 1 (6), 646-652 (2010) 648 locations far apart. For example the array a[i] would be transformed to a [3765i+3243]. So, the location of data in this huge array would be difficult to zero in on. Also, this could be Screenshots of MANGINS++ performing: a) Name mangling b) Insertion of dead codes c) Re-arrangement of arrays 3. EVALUATION OF OBFUSCATION TECHNIQUES AND MANGINS++ extended to include quadratic equations making the deciphering of locations much more complex and also reducing the readability of code to a minimum. So, MANGINS++ offers all these three functionalities in one package. So, all three functionalities can be used on the same program to get maximum security. Also, it can be used any number of times too, hence rendering the final code impossible to read or decipher. Although the field of obfuscation started at around 1994, there are very few obfuscators which could supplement all the needs of the industry or the different developers. Also, as the field is relatively new, there are very few approaches towards evaluating the transformations that are used and also evaluating the Obfuscator tool itself. In the previous section we have seen the design and implementation of our indigenous obfuscator.

649 Harsha Vardhan Rajendran et al., J. Comp. & Math. Sci. Vol. 1 (6), 646-652 (2010) This section zooms into the evaluation of MANGINS++, techniques and transformations that have been used in and implemented using MANGINS++. 3.1 PROCEDURE OF UNDERSTANDING STATUS OF AN OBFUSCATED CODE The concept of understanding whether a code is obfuscated or not is a vital concept. Not all commercially available obfuscators obfuscate the code correctly. Often these codes can be made intelligible by using an efficient decompiler. Hence the following process is a most generic way to check if the code is obfuscated properly. The following are the steps in this process. Step 2: Input this code to a decompiler. Step 3: Understand the primitive data structures and algorithms. Step 4: Give that java source code to the obfuscator. Step 5: Obfuscate the code Step 6: Give the output of the obfuscated code to decompiler again. Step 7: If the decompiler fails and is stuck at a point or aborts by giving a valid exception (like null pointer exception) then the code is successfully obfuscated; else not. Figure 1: Fish Bone exercise on the procedure of Obfuscation Note: If decompiler successfully can retrieve the same data code then the code is not properly obfuscated. 3.2 METRICS Considering from the perspective of users, developers, content providers and the code itself we have broadly classified the evaluation process on to 3 metrics. They are 1. Obfuscation Definition Metrics 2. Software Engineering metrics 3. User Interface Metrics a) Obfuscation Definition Metrics Basic definition of any obfuscating transformation is that, it should preserve the functional definition of the code and is at

Harsha Vardhan Rajendran et al., J. Comp. & Math. Sci. Vol. 1 (6), 646-652 (2010) 650 independence to change the visibility of the code. The transformations that are included in the MANGINS++ should also stand by the above said definition [2]. Below is a table which show the reach of MANGINS++ to all the different kinds of transformations which Type of Transformation Technique used satisfy the obfuscation definition metrics. This proves that MANGINS++ is undoubtedly an obfuscator which has every kind of transformation with it. Implementation by MANGINS++ (Status) Lexical Transformation Name Mangling Public variables Private variables Protected variables Data transformation Structuring of int data type Data and arrays float data type char data type Any other predefined data type Control transformation Dead code insertion Type of Transformation Technique used Implementation by MANGINS++ (Status) Lexical Transformation Name Mangling Public variables Private variables Protected variables Data transformation Structuring of int data type Data and arrays float data type char data type Any other predefined data type Control transformation Dead code insertion Table: Obfuscation transformations Implemented by MANGINS++ Note: These metrics have been chosen keeping mind the basic functionalities to be satisfied while obfuscating the code. We can also expect a better Obfuscator to use any novel and complex technique to obfuscate the code, but still those techniques will fall under the same transformations specifies above. b) Software Engineering Metrics These metrics have been considered after an avid experience and supportive suggestions from many users of obfuscators. Naturally, it is a very hard job to change the visible appearance without affecting the size of the code or lines of code. Hence the factors like Dependability and Reliability of the code, etc. are undoubtedly affected. We, in this

651 Harsha Vardhan Rajendran et al., J. Comp. & Math. Sci. Vol. 1 (6), 646-652 (2010) section are not going to delve deep into the software engineering concepts to comment on the different features of obfuscated code. But we use these metrics just to show the affect due to code getting obfuscated. Moreover, the software metrics considered are Dependability and Reliability. The relationships below guide the use of metrics. Obfuscation increases lines of code (not in the case of Data transformations). If number of lines increases Complexity increases and number of errors increases. If Complexity increases Reliability as well as Dependability decreases (as MTTF decreases). Although, the above given relationships are true in general sense, because of the cautious measures taken in construction MANGINS++ there will be a little effect on these metrics like reliability and dependability. We have written dead codes (which are to be inserted at strategic points) in such a way that they no longer carry any relationship with the core code and also do not affect the time complexity of the code considerably. These metrics comes into play only while working with a huge code and we need a proper test bed to comment on the practical variations in time complexity of the code. c) User interface metrics Efficient, flexible, dynamic and complete User interface is necessary for any software [8]. Hence even an Obfuscator needs to have a good user interface. The reason is simple because it should support all range of users from novice users to experts. Few of the many metrics for evaluating the user interfaces are listed below. Navigation to the obfuscated code Range of Inputs to Obfuscator Dictionary to store the proper as well the mangled names Aesthetic appeal of the interface MANGINS++, presently has a sufficient user interface and is working on Java applet. So many of the user interface metrics show that there has to be a considerable development in the interface of MANGINS++. 4. RESULT AND DISCUSSIONS As explained above the result of the obfuscated program is with the name mangled variables and the file which has the dead codes inserted into it. All the variable names are not mangled explicitly because when all the variable names are mangled it will give a totally different code which increases the likeliness of the code to be cracked. With well constructed dead codes, the obfuscator is making attempt to make sure that run times for ordinary sources code and also for the obfuscated codes are comparable and cannot be distinguished that easily. As said earlier restructuring of arrays will result in complicating the method to find the next address location as well as well as to change the visual appearance of code. When it comes to MANGINS++ the results are promising and are evaluated with use of novel metrics. On a whole the results of this work are -- understanding the different techniques of protecting the software, obfuscation, design of obfuscator MANGINS++ and evaluating it based on different metrics. 5. CONCLUSION The main conclusion that can be drawn from this paper is that obfuscation is indeed one of the better methods to protect code. Even though other methods like server

Harsha Vardhan Rajendran et al., J. Comp. & Math. Sci. Vol. 1 (6), 646-652 (2010) 652 side execution, code authentication, etc. do exist, code obfuscation is a step ahead of them. The reason is that obfuscation is simple unlike the other methods which have complex implementations. It can be performed independently in a client system. Also, we argue that MANGINS++ is a better obfuscator than many available obfuscators as it performs all three transformations (lexical, data, control), and that too as many times as required on the same code so as to provide greater security. The evaluation with many novel metrics also proved the same i.e. MANGINS++ is a powerful tool which provides practical security to the code. So, on this note we conclude by saying that MANGINS++ is indeed a force to reckon with. REFERENCES 1. B. Barak, O. Goldreich,R. Impagliazzo, S. Rudich, A. Sahai, S. Vadhan and K. Yang, Impossibility result (2001). 2. Christian Collberg, Clark, Douglas Low, A Taxonomy of Obfuscating Transformations, in Technical report #148, (1998). 3. Levent Ertaul, Suma Venkatesh, Novel Obfuscation Algorithms for Software Security, Department of Mathematics & Computer Science, California State University, Hayward CA, USA. 4. L. D Anna, B. Matt, A. Reisse, T. Van Vleck, S. Schwab and P. LeBlanc. First attempt to survey (2003). 5. L. D Anna, B. Matt, A. Reisse, T. Van Vleck, S. Schwab and P. LeBlanc. Selfprotecting mobile agents obfuscation report. Technical Report #03-015, Network Associates Labs, June 2003: http://opensource.nailabs.com/jbet/papers/ obfreport.pdf 6. M.R. Stytz, J. A. Whittaker, Software Protection-Security s Last Stand, IEEE Security and Privacy, January/February, pp. 95-98 (2003). 7. M.R Stytz, Considering Defense in Depth for Software Applications, IEEE Security & Privacy, February (2004). 8. Mayur Kamat & Nishant Kumat on Code Obfuscation presentation by: http://ee.tamu.edu/ reddy/ee689_04/pres_ mayur_nishant.pdf 9. Business Software Alliance http://global.bsa.org/usa/press/newsrelease s/2002-06 10.1129.phtml?CFID=4661&CFTOKEN= 73044918. 10. Obfuscators subdirectory of Dmoz.org: http://dmoz.org/computers/programming/ Languages/Java/ Development_Tools/Obfuscators/ 11. Resources on Obfuscation (huge collection of links to related papers, books& companies): http://www.scs.carleton.ca/ hshen2/ (resources section).