Bug Detection Using Particle Swarm Optimization with Search Space Reduction

2015 6th International Conference on Intelligent Systems, Modelling and Simulation

Arun Reungsinkonkarn
Department of Computer Information System, Assumption University
arunjai139@gmail.com

Paskorn Apirukvorapinit
Faculty of Information Technology, Thai-Nichi Institute of Technology
paskorn-a@tni.ac.th

Abstract

A bug detection tool is an important tool in software development. Many research papers have proposed techniques for detecting software bugs, but certain semantic bugs are not easy to detect. In our view, a bug can arise from incorrect logic such that, when the program is executed with a particular input, it behaves in unexpected ways. In this paper, we propose a method and tool for software bug detection that finds such an input, one that causes an unexpected output, guided by a fitness function. The method uses a Hierarchical Similarity Measurement Model (HSM) to help create the fitness function that examines program behavior. The tool uses Particle Swarm Optimization (PSO) with Search Space Reduction (SSR) to manipulate inputs by contracting and eliminating unfavorable areas of the input search space. The programs under experiment were selected from four different domains: finance, decision support systems, algorithms, and machine learning. The experimental results show a success rate of up to 93% in bug detection, compared to an estimated success rate of 28% without SSR.

Keywords: Bug Detection; Fitness Function; Hierarchical Similarity Measurement Model (HSM); Particle Swarm Optimization (PSO); Optimization.

I. INTRODUCTION

A software bug is any fault or error that causes unexpected results or unintended behavior in a computer program. Bugs occur for several reasons, ranging from human error in code implementation to design errors in frameworks and operating systems [1][2].
An error caused by a programmer's unintentional coding logic, on the other hand, is called a semantic bug. Such a bug makes the program behave abnormally during execution, which results in unexpected outcomes. A particular bug can be the root cause of a costly system failure.

Several techniques have been proposed for developing bug detection tools; however, their effectiveness depends largely on the type of bug [3][4][13]. Optimization is one family of techniques used in software bug detection, including Particle Swarm Optimization (PSO), Genetic Algorithms, and Evolutionary Computation [14][15]. Each is appropriate for different situations [5]. For example, PSO is used in test case generation to reach a global maximum while significantly reducing the number of test cases [6][12]. One research paper presents a strategy based on Boolean formulas that automatically generates test data for any implementation intended to meet a given specification. The effectiveness of the strategy in fault detection is assessed both analytically and empirically, while its cost is evaluated by comparing test set sizes [7].

In this paper, the software bug detection tool is built on two concepts: the Hierarchical Similarity Measurement Model (HSM) and Particle Swarm Optimization (PSO) with Search Space Reduction (SSR) [8][9]. HSM facilitates creation of the fitness function used to detect software behaviors, while PSO with SSR manipulates the inputs so that the actual output converges to the expected output.

Sponsored by Bangkok University.

II. RELATED WORK

Organizational Evolutionary PSO has been developed for test case generation; it reduces the number of test cases by reaching a global maximum [6]. Regression and machine learning methods have also been used to validate the predicted fault-proneness of software systems. For instance, such methods were applied to Mozilla, the open source web and e-mail suite, to detect the fault-proneness of its source code.
The metrics were applied to several versions of Mozilla to observe how the predicted fault-proneness changed during development [10]. Several strategies for test data generation have been presented for effective fault detection [7]. Another technique, fault injection, is used to evaluate web vulnerability scanners: by injecting common types of software faults into web application code, the results show the vulnerability coverage and the percentage of false positives [4]. Calysto is another effective tool for automatic bug detection. It is inter-procedurally path-sensitive, fully context-sensitive, and bit-accurate in modeling data operations. The results show that it discovers bugs completely automatically, with a very low rate of false error reports [11]. The effectiveness of data race detection varies across compiler optimization scenarios: under the same race detector, different optimization options can produce different race detection probabilities [3].

A Hierarchical Similarity Measurement Model (HSM) of a program's execution avoids having to explicitly form an equation by working like a black-box model. It uses a similarity value to compute a fitness function and supports primitive, abstract, and complex data types [8]. Applying Search Space Reduction (SSR) to Particle Swarm Optimization (PSO) helps identify solutions that would otherwise not be feasible to find, by reducing excessive exploitation steps [9].

2166-0670/15 $31.00 2015 IEEE. DOI 10.1109/ISMS.2015.20

III. BUG DETECTION USING HSM AND PSO WITH SSR

A software bug is a program error that causes the program to behave in unintended ways. This research introduces a method and tool for software bug detection based on the HSM and PSO-SSR concepts. The tool is classified as a dynamic bug detection tool, used during execution. Source code is therefore required so that the program can be executed, but the source code itself does not need to be analyzed. The program specification defines normal behavior and describes the expected output. After the program is executed, if the actual output is not equivalent to the expected output, the program has an error. To detect a bug, it is therefore necessary to find an input (if one exists) that makes the actual output differ from the expected output. For example, in an installment program, the monthly payment should not exceed the principal. The proposed bug detection tool is divided into five parts, as shown in Figure 1.

Figure 1. Flow Chart of Research Tool.

The input generator consists of a random number generator that randomly selects an input within the boundary of the search space. The generated input is fed to the execution part. Code execution is performed by the C# compiler, and the actual output (AO), expected output (EO), and inputs are stored for every execution. HSM with PSO-SSR is then applied to compare the actual output with the expected output. The data type of the output determines which similarity measurement method is selected, according to the hierarchical similarity measurement model. That method computes the similarity between the actual output and the expected output, and the similarity value is stored as well. PSO-SSR, the technique used to find an input, consists of three steps: exploration, exploitation, and search space reduction. When an input that satisfies a condition of the program behavior is found, the tool stores the related input and output with the data logging module. Otherwise, the tool repeats the HSM with PSO-SSR module until resources are exhausted.

PSO uses a fitness function as a guideline to find a solution, that is, an input to the program. However, the fitness function is not a typical mathematical equation that can be easily formulated, so HSM is used to help create a fitness function that examines program behavior. The computed fitness value lies between 0 and 1; a value of one means that a solution has been found. The fitness function is created from the program behaviors and the expected output. A program behavior can be a) normal behavior, for example, in the installment program, the principal should be greater than zero and there is no negative monthly payment, or b) abnormal behavior, for example, no loan can have a monthly payment that exceeds the principal. In addition, the fitness function can be written as a one-line expression or as a function.

Stock Analysis is a program that recommends whether a stock trader should buy a stock. When the high price (HP) and low price (LP) of the stock are greater than zero, indicating that there is trading volume in the market, the recommended buying ratio (RBR) shall be greater than zero, and both RBR and the stock trading density (STD) shall be between zero and one. The fitness function, which describes the abnormal behavior being searched for, can be written as below:

If HP >= 0 and LP >= 0 Then (RBR < 0 or RBR > 1), (STD < 0 or STD > 1)

The Traveling Salesman Problem (TSP) is an example in which the fitness function has to be written as a function because of the complexity of its behavior. The behavior of TSP must meet the following conditions: a) the home town and the last town must be the same, b) there must be no repeated towns in a route, and c) every town traveled to must be in the list of towns. The fitness function is illustrated in Figure 2.

Figure 2. Example of TSP Fitness Function.
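The Stock Analysis condition and the three TSP conditions described in this section can be sketched in code. The following Python sketch is illustrative only: it is not the authors' Figure 2, the function names are hypothetical, and it assumes a binary fitness (1.0 when the abnormal behavior is observed, 0.0 otherwise), whereas the tool's fitness value is derived from HSM similarity and may take intermediate values.

```python
def stock_fitness(hp, lp, rbr, std):
    """Return 1.0 when the abnormal Stock Analysis behavior is observed.

    Assumption: binary fitness; names and signature are hypothetical.
    """
    precondition = hp >= 0 and lp >= 0
    out_of_range = rbr < 0 or rbr > 1 or std < 0 or std > 1
    return 1.0 if precondition and out_of_range else 0.0


def tsp_route_is_valid(route, towns):
    """Check conditions a), b), and c) for a TSP route."""
    if len(route) < 2 or route[0] != route[-1]:
        return False  # a) home town and last town must be the same
    visited = route[:-1]  # drop the closing return to the home town
    if len(set(visited)) != len(visited):
        return False  # b) no repeated towns in the route
    return all(town in towns for town in route)  # c) towns are in the list


def tsp_fitness(route, towns):
    """Return 1.0 when the route violates any TSP condition, else 0.0."""
    return 0.0 if tsp_route_is_valid(route, towns) else 1.0
```

A program output such as the route A, B, B, A over the towns A, B, C violates condition b), so this sketch would flag it as abnormal behavior.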

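The three PSO-SSR steps named in Section III (exploration, exploitation, and search space reduction) can be sketched as a loop. This is a minimal Python sketch under stated assumptions, not the authors' implementation: it searches a single numeric input, uses the textbook PSO velocity update with illustrative constants (0.7, 1.5, 1.5), and contracts the bounds to half their width around the best input found so far. The actual SSR rule is described in [9]. The function names, parameters, and `toy_fitness` are hypothetical.

```python
import random


def pso_ssr(fitness, low, high, particles=10, explorations=20,
            exploitations=20, shrink=0.5, seed=0):
    """Search the interval [low, high] for an input with fitness 1.0.

    Returns (best_input, best_fitness, evaluations). Assumes the fitness
    value lies in [0, 1]; all constants are illustrative choices.
    """
    rng = random.Random(seed)
    best_x, best_f, evals = None, -1.0, 0
    for _ in range(explorations):
        # Exploration: scatter fresh particles inside the current bounds.
        xs = [rng.uniform(low, high) for _ in range(particles)]
        vs = [0.0] * particles
        pbest = []
        for x in xs:
            f = fitness(x)
            evals += 1
            pbest.append((f, x))
            if f > best_f:
                best_f, best_x = f, x
        # Exploitation: standard PSO velocity and position updates.
        for _ in range(exploitations):
            if best_f >= 1.0:
                return best_x, best_f, evals
            for i in range(particles):
                r1, r2 = rng.random(), rng.random()
                vs[i] = (0.7 * vs[i]
                         + 1.5 * r1 * (pbest[i][1] - xs[i])
                         + 1.5 * r2 * (best_x - xs[i]))
                xs[i] = min(max(xs[i] + vs[i], low), high)
                f = fitness(xs[i])
                evals += 1
                if f > pbest[i][0]:
                    pbest[i] = (f, xs[i])
                if f > best_f:
                    best_f, best_x = f, xs[i]
        if best_f >= 1.0:
            return best_x, best_f, evals
        # Search space reduction: contract the bounds around the best input.
        half = (high - low) * shrink / 2
        low, high = max(low, best_x - half), min(high, best_x + half)
    return best_x, best_f, evals


def toy_fitness(x):
    """Hypothetical fitness: 1.0 near x = 42, decaying with distance."""
    d = abs(x - 42.0)
    return 1.0 if d < 0.5 else 1.0 / (1.0 + d)


if __name__ == "__main__":
    print(pso_ssr(toy_fitness, -1000.0, 1000.0))
```

Unlike the tool described here, which keeps running after the first solution because more than one may exist, this sketch returns as soon as the fitness reaches one. That keeps the example short and is a simplification.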
Figure 3. Screen Shot of the Tool.

Figure 3 illustrates the user interface of the developed tool, which presents the essential details and is designed for ease of use.

IV. EXPERIMENT

Programs written for four domains are used in this research: finance, decision support systems, algorithms, and machine learning. We assume that 1) the program is executable by the C# compiler, and 2) the output of program execution is limited to the data types supported in HSM [8]. The details of each program are shown in Table I.

TABLE I. PROGRAM SELECTION SUMMARY.

Module | Application Domain | Application Type | No. of Input(s) | No. of Output(s) | LOC | Program Complexity
Installment | Financial | Web Service | 3 | 1 | 298 | Low

A. Experiment and Measurements

In an experiment, ten particles are generated at random input positions within the search space boundary. Twenty rounds of exploration are executed, each with 20 rounds of exploitation. In total, 4000 particles at various positions are executed in each experiment, regardless of whether a solution is found. Moreover, the experiment does not stop at the first solution found, because the tool may discover more than one solution. The computer used to run the experiment has an Intel Core i7 2.2 GHz processor, 8 GB of DDR3 RAM, and a 1 TB 5,400 rpm hard disk.

To evaluate the performance of the bug detection tool, we define the Success Rate (SR) as

SR = b / t    (1)

where b is the number of bugs found and t is the total number of bugs seeded into the program under experiment. The success rate measures the success in finding inputs that satisfy the fitness function. Whenever the value of the fitness function becomes one, a bug has been found. To determine the complexity of program behavior, we define the Behavior Complexity Ratio (BCR) as

BCR = 1 - (f / e)    (2)

where e is the maximum round of exploitation and f is the fastest round in which a solution is found. When no solution is found, BCR is zero.
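As a worked illustration of Equations (1) and (2), the short Python sketch below computes both measures. The function names are hypothetical. The SR inputs (27 bugs found out of 29 seeded) are the figures reported in this paper. The BCR inputs are assumptions made only for the example: f = 1600 is the round reported for Naïve Bayes, and e = 4000 assumes that rounds are counted over the 4000 particle executions of one experiment.

```python
def success_rate(bugs_found, bugs_seeded):
    """Equation (1): SR = b / t."""
    return bugs_found / bugs_seeded


def behavior_complexity_ratio(fastest_round, max_rounds):
    """Equation (2): BCR = 1 - f / e, or zero when no solution is found."""
    if fastest_round is None:  # no solution found
        return 0.0
    return 1.0 - fastest_round / max_rounds


print(f"SR  = {success_rate(27, 29):.1%}")                      # SR  = 93.1%
print(f"BCR = {behavior_complexity_ratio(1600, 4000):.2f}")    # BCR = 0.60
```

A solution found in an early round gives a BCR close to one, which is why a high BCR is read as low behavior complexity.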
The Behavior Complexity Ratio expresses the degree of complexity of the program behavior. A high BCR value indicates that the complexity of the program behavior is low, because a solution is found in an early round and many of the fitness functions are easy to satisfy.

TABLE I. PROGRAM SELECTION SUMMARY (continued).

Module | Application Domain | Application Type | No. of Input(s) | No. of Output(s) | LOC | Program Complexity
Stock Analysis | Decision Support System | Win App | 3 | 2 | 488 | Medium
TSP | Algorithm | Console App | 1 | 1 | 390 | High
Naïve Bayes | Machine Learning | Console App | 3 | 1 | 679 | High

In our experiment, 29 bugs are seeded into the programs to evaluate the bug detection tool. Table II shows the types of seeded bugs and the criteria for selecting programs and bugs.

TABLE II. SELECTION CRITERIA.

Bug seeded in program: incorrect conditions; incorrect calculation operator; incorrect order of calculation; incorrect order of statements; missing line of code; missing operator.
Criteria in selecting a program: various areas of application; a certain degree of complexity; a minimum of 200 LOC.
Criteria in seeded bug: easily occurred or reproduced; should not be trivial; not easy to detect manually.

Figure 4. Box Plot of the Fastest Round to a Solution Found.

From Figure 4, the experimental results show that the median of the fastest round in which a solution is found is skewed to the left for all programs, that is, toward early rounds, meaning that the proposed method and tool detect the bugs quickly. For Stock Analysis and Installment, the medians are below 200, indicating that the programs are not very complex, which corresponds to Table I. For the Traveling Salesman Problem, the median of the fastest round is nearly 500, implying that the program is rather robust, since it took the most rounds compared to the other programs. Notice that the box of Naïve Bayes in the box plot is the longest and that of Stock Analysis is the shortest. This indicates that the solutions of Naïve Bayes are spread widely over the search space, whereas the solutions of Stock Analysis are clustered in a particular area. To demonstrate the search space reduction capability: with a massive search space of approximately 15 million inputs in Naïve Bayes, the abnormal behavior is found at the 1600th round with SSR. This results from eight contractions of the search space.

Figure 5. Success Rate.

Figure 5 shows the success rate of each program under experiment. Of the 29 bugs seeded, 27 were found, so the overall success rate is 93%. Two bugs, in Stock Analysis and Naïve Bayes, could not be found because the value of the fitness function is affected not only by the inputs but also by constants that cannot be manipulated.

Figure 6. Behavior Complexity Ratio.

Figure 6 shows the Behavior Complexity Ratio of all programs under experiment. The overall average BCR is approximately 43%, suggesting that the complexity of the selected programs varied from low to high.

Figure 7. Average of Execution Time.

Figure 7 shows the average execution (running) time of all fitness functions for each program. Note that the average execution times of all programs, each under 1000 LOC, were approximately 60 seconds. We can conclude that our technique and tool can detect most of the bugs with a low execution time overhead.

V. CONCLUSIONS

The bug detection tool using PSO with SSR was developed in C# with Microsoft Visual Studio .NET and JSON. Four programs from four areas of application were used to evaluate the method and tool. Twenty-nine bugs were seeded into the programs, and 27 bugs were found.
The high success rate of 93% obtained in the experiment is attributed to the capability of PSO with SSR to substantially reduce the search space.

ACKNOWLEDGMENT

This paper would not have been possible without the help, support, and patience of two people. The author would like to specially thank Mr. Sarawut Rasniyom for his programming assistance in this research. The author also wishes to thank Ms. Waraporn Duriyavanich for her assistance in finishing this paper.

REFERENCES

[1] Gyimothy, T., Ferenc, R. and Siket, I., Empirical validation of object-oriented metrics on open source software for fault prediction, IEEE Transactions on Software Engineering, pp. 897-910, 2005.
[2] R. Kuma, S. K. Pandey, S. I. Ahson, Security in Coding Phase of SDLC, Wireless Communication and Sensor Network, pp. 188-120, 2007.
[3] Changjiang Jia and Chan, W.K., Which compiler optimization options should I use for detecting data races in multithreaded programs, Automation of Software Test (AST), pp. 53-56, 2013.
[4] Fonseca, J., Vieira, M., Madeira, H., Testing and Comparing Web Vulnerability Scanning Tools for SQL Injection and XSS Attacks, Dependable Computing, 2007. PRDC 2007. 13th Pacific Rim International Symposium, pp. 365-372, 2007.
[5] Nenortaite J., Butleris R., Application of Particle Swarm Optimization Algorithm to Decision Making Model Incorporating Cluster Analysis, Human System Interactions, pp. 88-93, 2008.
[6] Xiaoying P., Using Organizational Evolutionary Particle Swarm Techniques to Generate Test Cases for Combinatorial Testing, Computational Intelligence and Security, pp. 1580-1583, 2011.

[7] Weyuker, E., Goradia, T. and Singh, A., Automatically generating test data from a Boolean specification, Software Engineering, IEEE Transactions, pp. 353-363, 1994.
[8] Reungsinkonkarn A., Hierarchical Similarity Measurement Model of Program Execution, 4th IEEE International Conference on Software Engineering and Service Science, pp. 255-261, 2013.
[9] Reungsinkonkarn, A., Apirukvorapinit, P., Search Space Reduction of PSO for Hierarchical Similarity Measurement Model, 11th International Joint Conference on Computer Science and Software Engineering (JCSSE), pp. 23-27, 2014.
[10] Gyimothy, T., Ferenc, R. and Siket, I., Empirical validation of object-oriented metrics on open source software for fault prediction, IEEE Transactions on Software Engineering, pp. 897-910, 2005.
[11] Babic, D., Hu, A.J., Calysto, 30th International Conference Software Engineering ICSE '08. ACM/IEEE, pp. 211-220, 2008.
[12] Chengying M., Xinxin Y., Jifu C., Swarm Intelligence-Based Test Data Generation for Structural Testing, International Conference on Computer and Information Science, pp. 623-628, 2012.
[13] Michalewicz, Z., Deb, K., Schmidt, M., Stidsen, T., Test-case generator for nonlinear continuous parameter optimization techniques, IEEE Transactions on Evolutionary Computation, pp. 197-215, 2000.
[14] Rudnick, E.M., Patel, J.H., Greenstein, G.S., Niermann, T.M., A genetic algorithm framework for test generation, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, pp. 1034-1044, 1997.
[15] Gent, Kelson, Hsiao, Michael S., Dual-Purpose Mixed-Level Test Generation Using Swarm Intelligence, IEEE 23rd Asian Test Symposium (ATS), pp. 230-235, 2014.