Method for detecting software anomalies based on recurrence plot analysis




Journal of Theoretical and Applied Computer Science, Vol. 6, No. 1, 2012, pp. 3-12
ISSN 2299-2634
http://www.jtacs.org

Michał Mosdorf
Warsaw University of Technology, Institute of Computer Science, Poland
m.mosdorf@stud.elka.pw.edu.pl

Abstract: This paper evaluates a method for detecting software anomalies based on recurrence plot analysis of the trace log generated by software execution. The method relies on windowed recurrence quantification analysis for selected measures (e.g. Recurrence rate - RR or Determinism - DET). Initial results show that the proposed method is useful in detecting silent software anomalies that do not result in typical crashes (e.g. exceptions).

Keywords: anomaly detection, fault injection, recurrence plot, software dependability

1. Introduction

Detecting software anomalies caused by failures is an important part of software dependability methods, because it allows corrective actions to be taken once a failure is detected. The literature describes several approaches to detecting software failures. One group of methods builds very accurate, often formal, assertions that check program invariants. These methods usually impose execution overhead and require the resulting assertion set to be optimized. The authors of [1][2] present a technique that detects software faults through dynamic derivation of detectors that check discovered invariants. Their methods are based on analysis of the Dynamic Dependence Graph (DDG) [3][4], which represents dependencies between values observed during program execution. A similar approach is taken by the DAIKON tool [5], which can be used to generate assertion sets.

A different approach to failure detection is presented in [6][7][8], where the authors describe two techniques: EDDI (Error Detection by Duplicated Instructions) and CFC (Control Flow Checking). EDDI duplicates software instructions and inserts additional instructions that compare the results obtained from the original and the redundant instructions. CFC checks control flow by generating run-time signatures for control transfers, which are verified in different program blocks.

This paper evaluates an alternative method for detecting software execution anomalies, based on recurrence plot analysis. The method focuses on detecting anomalies in the dynamics of data flow rather than on checking value invariants in the examined software. It targets anomalies caused by software errors that do not result in typical application crashes, which are relatively easy to detect and compensate for. Instead, the proposed approach aims at software errors that change the overall dynamical properties of the software data flow, as characterized by a few recurrence plot quantification measures.

This paper is organized as follows. Section 2 gives a short overview of recurrence plot analysis and of the Recurrence Quantification Analysis measures used. Section 3 describes the proposed method for software anomaly detection. Section 4 describes the architecture of the artificial software model used to evaluate the method. Section 5 describes the software anomalies introduced into the model and the events caused by those anomalies. Section 6 describes the methodology and presents the obtained results. The paper closes with conclusions and future plans.

2. Recurrence Plot analysis

A recurrence plot is a technique for nonlinear data analysis that allows recurrent behavior of an m-dimensional phase space trajectory to be investigated through a two-dimensional representation. The underlying concept of recurrence in dynamical systems goes back to Poincaré (1890) [9]. Calculation of a recurrence plot starts with reconstruction of the phase space of the dynamical system. For this purpose the time delay method can be applied, with the autocorrelation function used to calculate the time delay τ [10]. In the next step the Grassberger-Procaccia method can be applied to calculate the dimension required for attractor reconstruction. The recurrence plot, which visualizes recurrences, is described by the matrix:

$$R_{i,j}(\varepsilon) = \Theta\left(\varepsilon - \lVert x_i - x_j \rVert\right), \quad i, j = 1, \ldots, N \qquad (1)$$

where N is the number of considered states $x_i$ in m-dimensional space, ε is the threshold distance, $\lVert \cdot \rVert$ is a norm and $\Theta(\cdot)$ is the Heaviside function.

The proposed method for detecting software anomalies is based on windowed Recurrence Quantification Analysis (RQA) for selected measures (e.g. Recurrence rate - RR or Determinism - DET). Anomalies are reported based on changes of the selected RQA measures. The results of this research focus mainly on two parameters, RR and DET, which are obtained from the following equations [10]:

$$RR = \frac{1}{N^2} \sum_{i,j=1}^{N} R_{i,j}(\varepsilon) \qquad (2)$$

$$DET = \frac{\sum_{l=l_{\min}}^{N} l \, P(l)}{\sum_{l=1}^{N} l \, P(l)} \qquad (3)$$

where N is the number of points on the phase space trajectory, P(l) is the histogram of the lengths l of the diagonal lines, $l_{\min}$ is the minimal length of a diagonal line that is counted, and ε is the neighborhood size.
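
The recurrence matrix and the RR and DET measures defined above can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions, not the implementation used in the paper: the function names (`embed`, `recurrence_matrix`, `recurrence_rate`, `diagonal_histogram`, `determinism`) are hypothetical, the Euclidean norm is assumed, a distance equal to ε is counted as a recurrence, the main diagonal is included in the sums, and `l_min = 2` is an assumed default.

```python
import numpy as np

def embed(series, dim, delay):
    """Time-delay embedding: row i is (x[i], x[i+delay], ..., x[i+(dim-1)*delay])."""
    x = np.asarray(series, dtype=float)
    n = len(x) - (dim - 1) * delay
    if n <= 0:
        raise ValueError("series too short for this dim/delay")
    return np.stack([x[k * delay : k * delay + n] for k in range(dim)], axis=1)

def recurrence_matrix(states, eps):
    """R[i, j] = 1 when ||x_i - x_j|| <= eps (Euclidean norm assumed)."""
    dist = np.linalg.norm(states[:, None, :] - states[None, :, :], axis=2)
    return (dist <= eps).astype(int)

def recurrence_rate(R):
    """RR: fraction of recurrence points among all N*N entries."""
    return R.sum() / R.size

def diagonal_histogram(R):
    """P[l] = number of diagonal lines of exactly length l."""
    n = R.shape[0]
    P = np.zeros(n + 1, dtype=int)
    for k in range(-(n - 1), n):
        run = 0
        # the appended 0 closes a run that reaches the end of the diagonal
        for v in np.append(np.diagonal(R, offset=k), 0):
            if v:
                run += 1
            elif run:
                P[run] += 1
                run = 0
    return P

def determinism(R, l_min=2):
    """DET: share of recurrence points lying on diagonal lines of length >= l_min."""
    P = diagonal_histogram(R)
    lengths = np.arange(len(P))
    total = (lengths * P).sum()
    return (lengths[l_min:] * P[l_min:]).sum() / total if total else 0.0
```

For a strictly periodic series every recurrence point lies on a long diagonal line, so DET equals 1, while isolated recurrence points contribute only to RR.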

RR measures the density of recurrence points in the recurrence plot. DET is the ratio of recurrence points that form diagonal lines to all recurrence points.

3. Software anomaly detection method

The method compares the results of windowed RQA of trace data from a program execution without anomalies with those from an execution that may be influenced by anomalies. Figure 1 shows the steps of the proposed method. In the first step the examined software is executed without anomalies to gather an undisturbed execution trace. The trace log contains a series of integer values that represent different transitions in the program state (e.g. function calls). In the next step the obtained execution trace is analyzed with the autocorrelation function and the Grassberger-Procaccia method to determine the delay and dimension required for attractor reconstruction. With these quantities, windowed RQA of the obtained trace log is performed. The time series obtained from this analysis describes the dynamical properties of the undisturbed software execution and is used as the comparison pattern for software anomaly detection.

Figure 1. Algorithm of proposed anomaly detection method

During anomaly detection, the RQA data obtained from the examined execution is compared with the RQA data generated from the original software execution. At the current stage of method development this comparison is performed offline, after the software execution has completed. This assumption was made to simplify the evaluation of the proposed approach. Future work will focus on developing a method that allows real-time software anomaly detection and classification of different dynamical states of the software.

4. Architecture of tested software

The proposed approach was verified in experiments on an artificial software model that simulated message flow between separate threads. The aim of the model was to simulate typical data flow between modules of, e.g., real-time software divided into separate application threads, as found in software based on operating systems such as FreeRTOS or RTEMS. The architecture of the tested software is shown in Fig. 2.

Figure 2. Architecture of tested software

The prepared software consists of one sender thread that generates message events with a Poisson distribution. Each message contains a randomly generated destination designator and an additional number describing the amount of time the receiver thread requires to process it. This number was also generated randomly with a Poisson distribution. Each message was inserted into the first queue, which connected the sender thread with the router thread. The router thread was responsible for routing received messages to the correct destination queue according to the destination designator. The model contained 6 receiver threads organized into 3 groups. Each thread group received messages from its own group queue. To create the execution trace, selected program points were equipped with log generation procedures. In the whole program 13 points were selected, representing message generation, send and receive events in different threads. Execution of each selected point generated a log entry containing a single integer in the range 1 to 13.

5. Simulated anomalies

During the experiment 6 execution traces were collected: one for proper execution and 5 for different simulated software anomalies. The anomalies were introduced artificially and concerned either the amount of time required to handle a message at the destination thread or the status of the thread (enabled or disabled; by default all threads were enabled). Each trace was collected for 3 minutes and contained about 14k reported events. The list below provides more details about the collected trace logs:

Execution without anomalies
1. A1 thread requires 2 times more time to handle messages
2. A1 thread requires 4 times more time to handle messages
3. A1 is not working
4. A1 and B1 require 2 times more time to handle messages
5. A1 and B1 are not working

In all experiments it was assumed that if the router thread could not insert a message into a receiver queue because the queue was full, the message was lost. No particular trace entry was generated for such an event. Figure 3 shows an example of the time series gathered for execution without anomalies.

Figure 3. Example of time series from execution trace without anomalies
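
The model and the anomalies described above can be approximated with a small tick-based simulation. The sketch below is an assumption-heavy illustration and not the software used in the experiments: the paper does not give the arrival rate, the queue capacity, the service times or the assignment of log numbers to program points, so `rate`, `queue_cap`, the Poisson mean of 3, the receiver names A2, B2 and C2, and the event numbering (1 = message generated, 2 = router receives, 3-5 = router sends to queue A/B/C, 6-11 = receivers A1..C2 take a message) are all hypothetical. Only 11 of the 13 log numbers are used here.

```python
from collections import deque
import numpy as np

GROUPS = "ABC"
RECEIVERS = ["A1", "A2", "B1", "B2", "C1", "C2"]  # two threads per group (assumed)

def simulate(ticks=2000, slow=None, disabled=(), queue_cap=20, rate=0.75, seed=0):
    """Return (trace, lost). `slow` maps a receiver name to a service-time multiplier."""
    rng = np.random.default_rng(seed)
    slow = slow or {}
    router_q = deque()
    group_q = {g: deque() for g in GROUPS}
    busy = {r: 0 for r in RECEIVERS}  # remaining ticks of work per receiver
    trace, lost = [], 0
    for _tick in range(ticks):
        # sender: Poisson-distributed number of new messages in this tick
        for _msg in range(int(rng.poisson(rate))):
            dest = GROUPS[int(rng.integers(len(GROUPS)))]
            work = 1 + int(rng.poisson(3))  # processing time in ticks
            router_q.append((dest, work))
            trace.append(1)
        # router: forward at most one message per tick, drop it if the queue is full
        if router_q:
            dest, work = router_q.popleft()
            trace.append(2)
            if len(group_q[dest]) < queue_cap:
                group_q[dest].append(work)
                trace.append(3 + GROUPS.index(dest))
            else:
                lost += 1  # lost silently, no trace entry
        # receivers: an idle, enabled thread takes the next message of its group
        for i, name in enumerate(RECEIVERS):
            if name in disabled:
                continue
            if busy[name] > 0:
                busy[name] -= 1
            elif group_q[name[0]]:
                busy[name] = group_q[name[0]].popleft() * slow.get(name, 1)
                trace.append(6 + i)
    return trace, lost
```

Under these assumptions, an anomaly of the second kind would correspond to `simulate(slow={"A1": 4})` and one of the fifth kind to `simulate(disabled=("A1", "B1"))`. The timings at which queues fill depend entirely on the assumed parameters and are not expected to match the paper.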

It is important to note that the software test model was tuned so that, without anomalies, the program worked in a stable way: the number of messages in all queues stayed low and no messages were lost. The introduced anomalies caused special events. The list below gives a short description of those events for anomalies 1 to 5:

(1) Queue A full at 1 minute 50 s
(2) Queue A full at 1 minute 10 s
(3) Queue A full at 1 minute
(4) Queue A full at 1 minute 40 s and queue B full at 2 minutes 40 s
(5) Queue A full at 55 s and queue B full at 1 minute

For an initial examination of the trace logs from the different executions, all reported program points were counted. The results are presented in Fig. 4. This initial inspection does not show much difference between the gathered trace logs. It can only reveal differences in the number of registered points associated with the operations of given threads. The total number of calls for thread A1 decreases for anomalies 1, 2 and 3, because these anomalies increase the time required to handle a message received from the queue (log number 6).

Figure 4. Number of different program points occurred in analyzed execution trace logs

6. Analysis of obtained results

In the first stage the execution trace without anomalies was analyzed with the autocorrelation function and the Grassberger-Procaccia method to determine the delay and dimension required for attractor reconstruction. The value of ε (required for recurrence plot calculation) was also selected based on the execution trace without errors. In the next step, for each of the execution traces with anomalies, many recurrence plots were created with a window size of 300 samples.
For each of the resulting recurrence plots, selected RQA measures were calculated. Figure 5 shows an example of a recurrence plot calculated for one window of the trace log collected from execution without anomalies.
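
The windowed analysis can be sketched as follows. The code is a hedged illustration rather than the procedure actually used: the function names are hypothetical, the delay is chosen as the first lag at which the autocorrelation falls below 1/e (a common heuristic; the paper does not state its criterion), the window step of 50 samples is an assumption, and only RR is computed per window. The embedding dimension is taken as given, because the Grassberger-Procaccia estimate is not reproduced here.

```python
import numpy as np

def autocorr_delay(x, max_lag=50):
    """Smallest lag at which the autocorrelation drops below 1/e (assumed criterion)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    var = np.dot(x, x)
    if var == 0:
        return 1
    for lag in range(1, max_lag + 1):
        if np.dot(x[:-lag], x[lag:]) / var < 1 / np.e:
            return lag
    return max_lag

def window_rr(window, dim, delay, eps):
    """Recurrence rate of one window after time-delay embedding."""
    w = np.asarray(window, dtype=float)
    n = len(w) - (dim - 1) * delay
    states = np.stack([w[k * delay : k * delay + n] for k in range(dim)], axis=1)
    dist = np.linalg.norm(states[:, None] - states[None, :], axis=2)
    return np.mean(dist <= eps)

def windowed_rr(trace, dim, delay, eps, size=300, step=50):
    """RR series over sliding windows of `size` samples, moved by `step` samples."""
    starts = range(0, len(trace) - size + 1, step)
    return np.array([window_rr(trace[s : s + size], dim, delay, eps) for s in starts])
```

The same delay, dimension and ε would be reused for every trace, so that the series from an execution with anomalies is directly comparable with the series from the reference execution.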

Figure 5. Example of calculated recurrence plot for window size of 300 samples of trace log collected from execution without anomalies

Figure 6 shows an example of a recurrence plot calculated for the trace log collected from execution with anomaly 5. The two recurrence plots differ visibly in the number and structure of recurrence points.

Figure 6. Example of calculated recurrence plot for window size of 300 samples of trace log collected from execution with anomaly 5

Figures 7 and 8 show the RR and DET measures calculated for the trace logs without anomaly and with anomaly 5. Both the RR and DET series are noisy. In both figures, at about 30% of the experiment time the series associated with anomaly 5 change value drastically. This is caused by the anomaly 5 event in which queues A and B become full (at 55 s and 1 minute of the 3-minute trace, respectively). Additionally, the value of RR from the beginning of the experiment shows that data from the anomaly 5 execution trace has a different dynamic character than the original data without anomalies.

Figure 7. RR measure calculated for trace logs without anomalies and with anomaly 5

Figure 8. DET measure calculated for trace logs without anomalies and with anomaly 5

Because of the noise in the RR and DET series, some anomalies may be difficult to distinguish from the original series. Figures 9 and 10 therefore show the series after filtering with a moving average window of 500 samples. After filtering, the anomaly series can be easily distinguished from the data obtained from the trace log of the system without anomalies.
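
The filtering step, together with one possible way of comparing a filtered series against the reference, can be sketched as below. The moving average follows the description above; the decision rule in `deviates` (flagging points that leave the range spanned by the smoothed reference series, widened by an optional `margin`) is an assumption introduced for illustration, since the paper compares the series visually and does not define a decision rule.

```python
import numpy as np

def moving_average(series, size):
    """Average over a sliding window; returns len(series) - size + 1 values."""
    kernel = np.ones(size) / size
    return np.convolve(series, kernel, mode="valid")

def deviates(baseline, candidate, size, margin=0.0):
    """Flag smoothed candidate points outside the smoothed baseline range (assumed rule)."""
    b = moving_average(baseline, size)
    c = moving_average(candidate, size)
    lo, hi = b.min() - margin, b.max() + margin
    return (c < lo) | (c > hi)
```

For the series discussed here, `baseline` would be the RR or DET series of the execution without anomalies and `size` would be 500.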

Figure 9. RR measures calculated for all trace logs containing data without anomaly and with all simulated anomalies. Original plot was filtered by moving averaging window with size of 500 samples.

Figure 10. DET measures calculated for all trace logs containing data without anomaly and with all simulated anomalies. Original plot was filtered by moving averaging window with size of 500 samples.

The presented results show that the RR and DET measures from the trace log without anomalies stay within a relatively small range. This is caused by the stable character of program execution without anomalies. For all introduced anomalies, the averaged RR value differed from the value computed from the trace log without anomalies. This property allows executions with anomalies to be distinguished from the original one. Additionally, the values of both measures for trace logs with anomalies vary over a much greater range. This is because the anomalies caused the affected queues to hold more data and eventually become full. The effect is especially visible for anomaly 5, which causes a very rapid increase in the number of messages held in queues A and B and queue blockage in a relatively short time.

7. Conclusions

This paper proposed a method for software anomaly detection. The approach performs windowed RQA on software execution trace logs and decides about anomalies by comparing the resulting RQA measures with those calculated for the original, undisturbed software execution. For evaluation, the method was applied to a very simple, artificial software model that simulated message flow between different program threads. Five anomalies were introduced into the model, each influencing the performance of threads responsible for handling messages. The anomalies disturbed the stable character of the model and caused the affected queues to hold more messages. The test results showed that RQA measures made it possible to distinguish executions with an anomaly from the original execution.

The results of this initial study show that recurrence plot analysis can be a useful tool for detecting anomalies in software execution. In particular, the approach can help detect silent software errors that do not result in typical application crashes (e.g. exceptions). Errors of this type may change the statistical behavior of the system or degrade its performance. In the future the method can be applied to anomaly detection in more complex systems, such as an operating system kernel.

A drawback of this solution is the high computational power required to perform recurrence plot analysis. For this reason, the applicability of the method to real-time applications will be investigated in future research. Additionally, due to the presence of noise, data obtained from RQA may be difficult to read.
In this paper an additional moving average was used to show the differences between the anomaly series and the original series. Because of this, making a reliable and rapid decision about a possible anomaly may be difficult. This issue will be investigated in future work.

References

[1] Pattabiraman K., Kalbarczyk Z., Iyer R. K., Application-Based Metrics for Strategic Placement of Detectors, Dependable Computing, 2005. Proceedings. 11th Pacific Rim International Symposium on, 12-14 Dec. 2005.
[2] Pattabiraman K., Saggese G. P., Chen D., Kalbarczyk Z., Iyer R. K., Dynamic Derivation of Application-Specific Error Detectors and their Implementation in Hardware, Dependable Computing Conference, 2006. EDCC '06. Sixth European, 18-20 Oct. 2006.
[3] Austin T. M., Sohi G. S., Dynamic Dependency Analysis of Ordinary Programs, ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture, 1992.
[4] Tip F., A Survey of Program Slicing Techniques, Journal of Programming Languages, Volume: 5399, Issue: 3, Publisher: Citeseer, Pages: 1-65, 1995.
[5] Ernst M., Cockrell J., Griswold W., Notkin D., Dynamically Discovering Likely Program Invariants to Support Program Evolution, IEEE Trans. on Software Engineering, 27(2), 2001.

[6] Reis G. A., Chang J., Vachharajani N., Rangan R., August D. I., SWIFT: Software Implemented Fault Tolerance, Proceedings of the 3rd International Symposium on Code Generation and Optimization, 2005.
[7] Oh N., Shirvani P. P., McCluskey E. J., Control-flow checking by software signatures, IEEE Transactions on Reliability, volume 51, pages 111-122, March 2002.
[8] Oh N., Shirvani P. P., McCluskey E. J., Error detection by duplicated instructions in super-scalar processors, IEEE Transactions on Reliability, 51(1):63-75, March 2002.
[9] Poincaré H., Sur le problème des trois corps et les équations de la dynamique, Acta Mathematica 13 (1890) 1-271.
[10] Marwan N., Romano M. C., Thiel M., Kurths J., Recurrence plots for the analysis of complex systems, Physics Reports, Volume 438, Issues 5-6, Pages 237-329, January 2007.