Detecting Symbian OS Malware through Static Function Call Analysis
|
|
|
- Lynne Stafford
- 9 years ago
- Views:
Transcription
1 Detecting Symbian OS Malware through Static Function Call Analysis Aubrey-Derrick Schmidt, Jan Hendrik Clausen, Ahmet Camtepe, and Sahin Albayrak Technische Universität Berlin - DAI-Labor Ernst-Reuter-Platz 7, Berlin {aubrey.schmidt, jan.clausen, ahmet.camtepe, sahin.albayrak}@dai-labor.de Abstract Smartphones become very critical part of our lives as they offer advanced capabilities with PC-like functionalities. They are getting widely deployed while not only being used for classical voice-centric communication. New smartphone malwares keep emerging where most of them still target Symbian OS. In the case of Symbian OS, application signing seemed to be an appropriate measure for slowing down malware appearance. Unfortunately, latest examples showed that signing can be bypassed resulting in new malware outbreak. In this paper, we present a novel approach to static malware detection in resource-limited mobile environments. This approach can be used to extend currently used thirdparty application signing mechanisms for increasing malware detection capabilities. In our work, we extract function calls from binaries in order to apply our clustering mechanism, called centroid. This method is capable of detecting unknown malwares. Our results are promising where the employed mechanism might find application at distribution channels, like online application stores. Additionally, it seems suitable for directly being used on smartphones for (pre-)checking installed applications. 1 Introduction Our daily lives become more and more dependent upon smartphones due to their increasing capabilities. Smartphones are used for various purposes including web browsing or online payment. Thus, security threats to these devices become more and more critical. Malware writers started creating malware for smartphones in 2004, mainly targeting Symbian OS. Symbian introduced mandatory application signing in version S60 3rd in for coping with this problem. Application sign- 1 First S60 3rd device shipped in March 2006 named Nokia 3250 ing was performed by third-party companies that checked submitted applications for a certain set of requirements, e.g. proper memory handling and certificate level. The certificate basically grants access to different kinds of API calls basing on the privilege level. The signing mechanism turned out to prevent malware development for this platform until the first spyware for Symbian OS 3rd was published [15]. After this spyware appearance it took almost 3 years until the first malware for the 3rd version appeared [6]. The reason for this is not obvious. On the one hand one could argue that the market share of Symbian OS is not as big as the one of MS Windows. On the other hand the complexity of Symbian OS programming might distract malware writers looking for fast results. Therefore, we can state that applying third-party application checking is a good approach towards malware prevention while applied mechanisms should consider more comprehensive checks for detecting malware. In this work, we apply static analysis of function calls in order to detect malicious applications. This presents a novel approach to static malware detection in mobile environments and can extend third-party application checking for increased application security. Additionally, not only signing processes 2 can benefit from this approach: platforms mainly using online application stores can also employ this type of analysis for detecting malicious software, e.g. Android 3 or iphone 4. These online stores require the submission of the to be published application which is an appropriate time for applying the analysis. In some cases it is possible to bypass application stores for downloading software. Therefore, we also consider the option of moving these checks directly to the mobile device. Basing on common clustering methods, we developed a light-weight algorithm called centroid machine. This algorithm is used for detecting, in particular, Symbian OS malware on the basis of function calls which suffices the re- 2 as known from Symbian OS
2 quirements of mobile devices, e.g. efficiency, speed and limited resource usage. The results of the centroid machine are compared with the results of the light-weight Naive Bayes classifier [11] as well as with a heavy-weight support vector machine method. Our approach is not limited to Symbian OS, but since this platform has the highest amount of available malware among smartphone systems, we chose it for validation purpose. This work is structured as follows. In Section 2, we present related work on static analysis for malware detection. In Section 3, we describe how we collected our data set on which our work is basing on. Section 4 presents our approach towards static analysis of Symbian OS function calls for detecting malicious applications. In Section 6, we conclude. 2 Related Work Static analysis is a quite new field for mobile malware research, therefore most related publications still refer to stationary systems, like PCs. Wang et al. [16] observe the usage of DLL and API methods for training support vector machines (SVM) for behavior detection. Behavior-based approaches normally suffer from a high false positive rate while needing a significant amount of processing power, storage, and memory. The authors claim an accuracy of 99% but do not present the corresponding detection rate which would help to rate the quality of the results. Egele et al. [5] describe a static analysis for PHP web applications checking the requests while considering the call parameters for creating more precise detection models. Christodorescu et al. [3] use static analysis for creating assembler-based program automatons for detecting malicious activity using the disassembler IDA Pro. In particular, they address the problem of obfuscation which current commercial anti-malware application still can not handle properly. Their tool is called static analyzer for executables (SAFE) and seems to be an appropriate approach for handling malware through an stationary system. The drawback is that on-device detection requires light-weight methods complicating a possible transfer of the presented approach to a limited mobile device. Bayer et al. [1] present a tool named TTanalyze. This tool is able to monitor Windows applications dynamically by monitoring usage of system and API calls. The focus of this work does not lie in the detection of malware, the aim is to understand the malware behavior for decreasing the window of vulnerability. Bergeron et al. [2] perform a semantic analysis of binary code. Their approach is separated into three stages: (1.) creation of an intermediate representation, (2.) flow-based behavior analysis, and (3.) static verification of critical behaviors against security policies. Flow-based analysis is a valuable technique for investigating malware but currently not suitable for smartphones due to resource constraints. Krügel et al. [7] use static analysis on binaries in order to detect kernel-level rootkits via instruction sequences before corresponding modules get loaded into kernel. They state that their prototype did not produce any false-positive while detecting all tested rootkits. Additionally, the authors refer to a problem caused by the exponential explosion of possible paths that need to be followed which clearly indicates that currently, this approach cannot be applied to smartphones. These path are created by creating states of the observed machine for analysis of control flows. Provos [9] wrote a tool capable of generating and enforcing policies concerning system calls. The Systrace tool is intended to be efficient and does not impose significant performance penalties while currently aiming for stationary Linux/Unix systems. Warrender et al. [17] compare sequences of system calls in order to distinguish normal from abnormal behavior. They test four methods with increasing complexity, e.g. Hidden Markov Models (HMM) and come to the conclusion that although HMM gives the best accuracy the less complex ones are sufficient. The problem remains that analyzing call sequences is a complex task currently not suitable for smartphones. Moser et al. [8] investigate the limits of static analysis for malware detection. They state that using obfuscation detection via static analysis can be evaded. Therefore, static analysis should be extended by dynamic analysis. In our case we belief that our approach is less vulnerable to some of the mentioned obfuscation techniques. Considering changes in call sequences, our approach will ignore any of these since simple statistics on function call occurrences are checked. Schmidt et al. [13] use system and library functions for comparing malware with benign software on Android. Using state-of-the-art classifiers resulted in high detection rates with moderate false-positive rate. Since Android lacks corresponding malwares, Linux malwares were ported to this platform for building a source of malicious software. Applicability was checked by comparing original source code with ported source code where the results showed only minor and ignorable differences. 3 Function Call Extraction from Symbian OS Executables Only few malwares are known for current Symbian OS 3rd. First malware targeting Symbian OS 3rd appeared in February 2009 which used a valid certificate. This hap-
3 pened shortly after Mulliner 5 presented a way to bypass the security mechanisms of Symbian OS 3rd at Black Hat Conference 2008 in Japan. Mulliner additionally stated that he was wondering why no one else was trying this common approach earlier. These events suggest us that the number of such malwares developed may increase substantially in near future. In this work, we consider using the former Symbian platform version, namely Symbian OS 2nd, for benefiting from the huge amount of existing malwares. Although these binaries base on the older version of Symbian OS, we believe that the results of this work can also be applied to the newer versions of Symbian OS. The main reason for this is that from a function call perspective most of the calls remained the same while some were removed and new ones were added 6. Over 300 malwares appeared up to date for Symbian OS 2nd, [12]. We start by eliminating malwares which are simple file containers (installers) overwriting critical files and which are based on similar code bases sometimes only changing the name of the installation file or the installation note. After filtering, we ended up with data sets consisting of 33 Symbian OS 2nd malwares as well as on 49 popular applications for the same version. 33 malwares obviously do not form a statistically proof set but it is the only possibility to work on real data for smartphone platforms. One could argue that researching stationery systems can lead to transferable solutions for smartphones. But key differences between these systems have to be taken into account: Smartphone are highly connected while frequent connection changes through different networks interfaces are common, e.g. 2G, 2.5G, 3G, WiFi Smartphones are single-user systems in most cases which allows to disregard aspects of multi-user systems. Most cellular networks use NAT to assign IP- Addresses to cellular- and smartphones which decreases the possibility of attacking IP addresses directly. Smartphone operating systems allow to develop security systems in a less complex environment. Although smartphones provide various functionality, their main purpose is the usage of communicationcentric services, like phone, messaging, and nowadays Internet applications. 5 Security Researcher from Berlin, Germany 6 Differences_between_S60_2nd_and_3rd_Edition Table 1. Mapping of variables and functions Variable Mapping database of executables x executable Ω union of function calls from all x ω function call in Ω P(Ω) power set of Ω b class of benign executables in m class of malicious executables in Θ most frequent calls from both m and b Λ top 14 calls from Θ ˆµ m (ω) frequency of attributes in m ˆµ b (ω) frequency of attributes in b ˆσ b (ω) standard deviation in b Most critical threats to smartphones are DoS attacks, information stealing, and financial service charge abuse. These aspects encourage us to stick to real malwares for research. Ignoring research on such platform due to limited data sets can result in millions of unprotected users. We used IDA Pro 7 for extracting the function calls. Comprehensive tutorials can be found online as well as in [4]. While exploring the installation binaries, we faced the problem that not all could be unpacked by IDA Pro. To solve this problem we used a tool called UnSIS 8 that was able to give us access to the corresponding files. As an alternative, the tool SISInfo 9 can be used to do the same. We state that our findings can be applied to mobile devices. One drawback of this statement is of course that IDA Pro is not available for any smartphone platform. But our research has shown that implementing similar relevant functionality on current and future smartphone platforms will be possible. In case of Android you can install the readelf 10 application which delivers detailed information on relocation and symbol tables of each ELF object file. Most interestingly, it outputs the static list of referenced function calls for each application chosen. 4 Static Function Call Analysis of Symbian OS Soft- and Malware 4.1 Descriptive Statistics When extracting function calls we only take into account the function name itself without considering parameters or sdks/developer_tools/critical/unsis/index.jsp 9 jpsukane/ sisinfo.html 10
4 Figure 1. The x-axis displays the percentage of malware, the y-axis displays the number of calls from Ω which appear in the malware. The dotted line reveals that there are about 50 calls which appear in 70% of the malwares. arguments. The set of all functions, appearing in at least one of the executables in our database, will be denoted by Ω, the elements of which are the attributes in our machine learning approach. The static function call analysis retrieves for each executable x a set of functions, which establishes the map ζ : P(Ω) x ζ(x), where P(Ω) is the power set of Ω. In this section we reveal some facts on the appearance of function calls in the executables we analyzed and present a descriptive statistic on the attribute distributions. In our database we deal with 33 malicious and 49 benign programs. Overall, 3620 unique function calls where discovered, where 254 of these only appeared in malware, i.e. they are never called by any benign program. The graph in Figure 1 gives a rough overview on the attribute distribution on the malware class. It reveals that there are almost 50 attributes in Ω which appear in 70% of the malware, and about 40 which appear in the static analysis of 80% of the malware. Note that the curve in Figure 1 declines steeply beyond the 85% mark. Malware detection with a rate higher than 90% is unreachable by simple methods, such as looking at single attributes. The results from Figure 1 clearly emphasize that despite of that we already filtered out malwares with similar code bases the other still remain similar. This underlines estimations that most Symbian OS malwares base on a very small set of initial malware source (1) code. Feature Extraction. To make our detection system more efficient it should only be based on a subset of Ω. Additionally, one should not rely on single malware-specific calls since they may be replaced or omitted in future. To increase robustness with respect to such call replacements, one should take calls into account which are standard and widely used, i.e. calls appearing frequently in both malware and benign programs. The set of these calls will be denoted by Θ. Some characteristics of the most malware-typical calls in Θ are presented in Figure 2 and will be called Λ. An attribute ω Ω is regarded as more malware-typical the greater the quantity gets. t(ω) = ˆµ m (ω) (ˆµ b (ω) + 3ˆσ b (ω) ) (2) where ˆµ m (ω) = 1 m x m 1 {ω ζ(x)} is the frequency of the attribute within the malware class m (for 1 being the indicator function which equals 1 if the underscored statement is true and 0 if not), ˆµ b (ω) the frequency within the benign programs, 1 ˆσ b (ω) = (1 {ω ζ(x)} ˆµ b ) b 1 2 x b the empirical standard deviation within the benign class b. Three times σ is chosen in (2) since it is a common empirical rule, meaning that for a real random variable almost all
5 Figure 2. The x-axis shows the attributes of Λ, the y-axis their appearance frequencies: black displays the frequency in benign programs (ˆµ b (ω)), light-gray the frequency in benign programs plus three times standard deviation (ˆµ b (ω) + 3ˆσ b (ω)), dark-gray the frequency in malware (ˆµ m (ω)). The longer the dark-gray bar gets, the higher the probability is that a program is malicious, premising the call occurs in the static analysis. values lie within 3 standard deviations of the mean [10]. As our final feature set Λ, we picked attributes with the greatest t(ω) > 0 values and ω Θ. 4.2 Detecting Malware by Means of Machine Learning With the simple statistical methods of the previous section, detection rates of more than 90% can not be achieved. In order to gain better detection precision and accuracy we employ machine learning techniques. By applying appropriate models, we take the interdependencies of attributes into account. We propose an algorithm called centroid machine, designed for detecting symbian malware on the basis of function calls. This suffices the requirements of mobile devices, i.e. efficient, fast and light-weight. Furthermore, we compare the quality of centroid machine with a version of a support vector machine and a naive Bayes classifier [11] since every simple classifying algorithm has to keep pace with these state of the art approaches. Definition of Centroid Machine. The centroid machine classifies an executable via clustering. Each cluster is defined by a centroid, where the clusters are called c m and c b for the malicious and benign classes respectively. An executable is classified as malicious if its closer to c m and benign if it is closer to c b. To make such distance calculations possible, we have to map the set of attributes into a metric space via the kernel function ˆk : P(Ω) R Ω C Ω i=1 1 {ω i C} e i (3) for an ordered Ω = {ω 1,..., ω Ω }. The set R Ω forms a metric space with the euclidean distance d. After applying the attribute-extracting function ζ from (1), we attain a kernel which operates directly on the set of executables k = k ζ. A centroid for a class j { malicious, benign } is chosen in a way that minimizes the sum of squared euclidean distances d to all data points of her class j c j = min d 2 (ˆµ, x). ˆµ R Ω x j Then, our classifier can be defined as a map C assigning each executable one of the classes c m or c b. { malicious, if d(k(x), cm ) < d(k(x), c C : x b ) benign, else. (4)
6 Table 2. Statistical figures characterizing the quality of different learning models, excerpt from a 10-fold cross-validation. Model Malware Detection Rate Accuracy Attribute Set Attributes Centroid Machine Ω 3620 Centroid Machine Θ 254 Centroid Machine Λ 14 Naive Bayes Λ 14 Binary SVM Λ 14 In order to vary the sensitivity we introduce an alarm threshold β [0, ] such that any x is classified as malware if d(k(x), c m ) d(k(x), c b ) < β (5) and classified as benign otherwise. This means with increasing β, probability increases that arbitrary checked executables will be detected as malicious. Figure 3 visualizes the aforementioned classification. 5 Results and Discussion d(k(x),cb) Cb k(x) d(k(x),cm) Cm Figure 3. Example clusters of executables with benign center of gravity c b and malicious c m in relation to checked executable k(x). For our statistical investigation we performed 1000 runs of ten-fold cross-validation, where in each loop execution the data is folded randomly into a training set containing 9/10-th of the data and a test set containing the remaining one tenth. We applied the algorithms with different attribute sets, which are the aforementioned Ω, Θ, and Λ. As a representative for support vector machine classifiers, we employed the implementation of Chang and Chih-Jen 11 based on the algorithm proposed by Scholkopf et al. [14] with standard parameters. We also varied the SVM parameters, but results did not improve significantly. We used MatLab 12 as learning environment. 11 libsvm: a library for support vector machines, 2001, cjlin/libsvm 12 MATLAB, The Language of Technical Computing, R2008b, The MathWorks, Inc. On Table 2, the averages of classification rates are displayed. Note that centroid machine outperforms naive Bayes and has even a better accuracy than the heavy-weight support vector machine. Also note that the dramatic attribute reduction from 3620 to only 14 is accompanied by an acceptable decline of detection rate and accuracy. This reveals the possibility to only take a relatively small subset of Ω into account while keeping the discriminative potential of the attribute set. This potential small subset of features also allows moving detection logic to mobile devices while not encumbering the devices significantly. By varying the alarm threshold β of the centroid machine in Formula 5 the sensitivity varies. The resulting ROC-graph is depicted in Figure 4 while the area under the curve is AUC = Referring to Figure 2, the function calls from Λ 13 point to Bluetooth-based calls giving a good indication for detecting malware. Due to this numbers, we can additionally state that most sophisticated 14 Symbian OS malwares obviously use at least Bluetooth for propagation. Revisiting drawbacks considering static analysis as presented from Moser et al. [8], it is not trivial to estimate the vulnerability of centroid to common obfuscation techniques. Since direct usage of machine code is very uncommon in Symbian OS applications 15, appearance of related initial function calls should be detected. Additionally, since our approach only relies on the function calls themselves and not call sequences, modification on the sequences do not harm our approach. The same applies for the Android platform, while our estimations of course still need to be validated. 6 Conclusion and Future Work We described a set of Symbian OS malware analyzed by a clustering method called centroid. This clustering method was validated with the analyzed malware set. Centroid is based on a static function call analysis and distinguishes 13 representing the intersected top 14 calls from Symbian OS mal- and software 14 being more sophisticated than malwares only overwriting files through exploiting features from the Symbian OS installation system 15 except drivers
7 Figure 4. ROC Curve for the centroid machine for varying alarm threshold β from (5). The area under the curve is AUC = malicious from benign program by a learning concept. Furthermore, we compared its quality parameters with some standard state-of-the-art learning algorithms and showed that it is competitive. Additionally, it is having the advantage of being lighter, which makes it more appropriate for the requirements of a smartphone platform. Moreover, we presented a attribute reduction method by which we successfully reduced dimensionality dramatically without significant loss of detection quality. We will continue with investigations on following topics: Examine different methods for improving classifiers for the requirements of mobile phone. Implement a first prototype. Extend our analysis on benign programs to create an malware detection system based on unsupervised learning, capable to learn a concept of normality and therefore capable to detect malware, which has been unknown until now. References [1] U. Bayer, A. Moser, C. Krügel, and E. Kirda. Dynamic analysis of malicious code function call. Journal in Computer Virology, 2(1):67 77, [2] J. Bergeron, M. Debbabi, J. Desharnais, M. M. Erhioui, Y. Lavoie, and N. Tawbi. Static detection of malicious code in executable programs. In Proceedings of the Symposium on Requirements Engineering for Information Security (SREIS 01), [3] M. Christodorescu and S. Jha. Static analysis of executables to detect malicious patterns. In Proceedings of the 12th USENI Security Symposium, pages , [4] C. Eagle. The IDA Pro Book: The Unofficial Guide to the World s Most Popular Disassembler. No Starch Press, San Francisco, CA, USA, [5] M. Egele, M. Szydlowski, E. Kirda, and C. Kruegel. Using static program analysis to aid intrusion detection. In R. Bschkes and P. Laskov, editors, DIMVA, volume 4064 of Lecture Notes in Computer Science, pages Springer, [6] F-Secure. Trojan:symbos/yxe, Feb symbos_yxe.shtml. [7] C. Kruegel, W. Robertson, and G. Vigna. Detecting kernellevel rootkits through binary analysis. In Proceedings of the Annual Computer Security Application Conference (AC- SAC), pages , Dec [8] A. Moser, C. Kruegel, and E. Kirda. Limits of static analysis for malware detection. In ACSAC, pages , [9] N. Provos. Improving host security with system call policies. In SSYM 03: Proceedings of the 12th conference on
8 USENI Security Symposium, pages 18 18, Berkeley, CA, USA, USENI Association. [10] F. Pukelsheim. The three sigma rule. The American Statistician, 48:88 91, [11] I. Rish. An empirical study of the naive bayes classifier. Technical report, IBM Research Division, [12] A. Schmidt and S. Albayrak. Malicious software for smartphones. Technical Report TUB-DAI 02/08-01, Technische Universität Berlin, DAI-Labor, Feb http: // [13] A.-D. Schmidt, R. Bye, H.-G. Schmidt, J. Clausen, O. Kiraz, K. Yüksel, A. Camtepe, and S. Albayrak. Static analysis of executables for collaborative malware detection on android. In IEEE International Congress on Communication (ICC 2009) - Communication and Information Systems Security Symposium, Dresden, Germany, June [14] B. Scholkopf, A. J. Smola, R. C. Williamson, and P. L. Bartlett. New support vector algorithms. Neural Computation, 12: , [15] Symantec. Spyware.FlexiSpy, Mar [16] T. Wang, C. Wu, and C. Hsieh. A virus prevention model based on static analysis and data mining methods. In Computer and Information Technology Workshops, pages , [17] C. Warrender, S. Forrest, and B. Pearlmutter. Detecting intrusions using system calls: alternative data models. In Proceedings of the 1999 IEEE Symposium on Security and Privacy, pages , 1999.
LASTLINE WHITEPAPER. Why Anti-Virus Solutions Based on Static Signatures Are Easy to Evade
LASTLINE WHITEPAPER Why Anti-Virus Solutions Based on Static Signatures Are Easy to Evade Abstract Malicious code is an increasingly important problem that threatens the security of computer systems. The
Full System Emulation:
Full System Emulation: Achieving Successful Automated Dynamic Analysis of Evasive Malware Christopher Kruegel Lastline, Inc. [email protected] 1 Introduction Automated malware analysis systems (or sandboxes)
Intrusion Detection via Machine Learning for SCADA System Protection
Intrusion Detection via Machine Learning for SCADA System Protection S.L.P. Yasakethu Department of Computing, University of Surrey, Guildford, GU2 7XH, UK. [email protected] J. Jiang Department
Detection and mitigation of Web Services Attacks using Markov Model
Detection and mitigation of Web Services Attacks using Markov Model Vivek Relan [email protected] Bhushan Sonawane [email protected] Department of Computer Science and Engineering, University of Maryland,
Static Analysis of Executables for Collaborative Malware Detection on Android
Static of Executables for Collaborative Malware Detection on Android Aubrey-Derrick Schmidt, Rainer Bye, Hans-Gunther Schmidt, Jan Clausen, Osman Kiraz, Kamer Ali Yüksel, Seyit Ahmet Camtepe, and Sahin
LASTLINE WHITEPAPER. In-Depth Analysis of Malware
LASTLINE WHITEPAPER In-Depth Analysis of Malware Abstract Malware analysis is the process of determining the purpose and functionality of a given malware sample (such as a virus, worm, or Trojan horse).
Predict Influencers in the Social Network
Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, [email protected] Department of Electrical Engineering, Stanford University Abstract Given two persons
LASTLINE WHITEPAPER. Large-Scale Detection of Malicious Web Pages
LASTLINE WHITEPAPER Large-Scale Detection of Malicious Web Pages Abstract Malicious web pages that host drive-by-download exploits have become a popular means for compromising hosts on the Internet and,
The Behavioral Analysis of Android Malware
, pp.41-47 http://dx.doi.org/10.14257/astl.2014.63.09 The Behavioral Analysis of Android Malware Fan Yuhui, Xu Ning Department of Computer and Information Engineering, Huainan Normal University, Huainan,
An Efficient Way of Denial of Service Attack Detection Based on Triangle Map Generation
An Efficient Way of Denial of Service Attack Detection Based on Triangle Map Generation Shanofer. S Master of Engineering, Department of Computer Science and Engineering, Veerammal Engineering College,
Application of Data Mining based Malicious Code Detection Techniques for Detecting new Spyware
Application of Data Mining based Malicious Code Detection Techniques for Detecting new Spyware Cumhur Doruk Bozagac Bilkent University, Computer Science and Engineering Department, 06532 Ankara, Turkey
Predicting Flight Delays
Predicting Flight Delays Dieterich Lawson [email protected] William Castillo [email protected] Introduction Every year approximately 20% of airline flights are delayed or cancelled, costing
Classifying Large Data Sets Using SVMs with Hierarchical Clusters. Presented by :Limou Wang
Classifying Large Data Sets Using SVMs with Hierarchical Clusters Presented by :Limou Wang Overview SVM Overview Motivation Hierarchical micro-clustering algorithm Clustering-Based SVM (CB-SVM) Experimental
Defending Behind The Device Mobile Application Risks
Defending Behind The Device Mobile Application Risks Tyler Shields Product Manager and Strategist Veracode, Inc Session ID: MBS-301 Session Classification: Advanced Agenda The What The Problem Mobile Ecosystem
RBACS: Rootkit Behavioral Analysis and Classification System
2010 Third International Conference on Knowledge Discovery and Data Mining RBACS: Rootkit Behavioral Analysis and Classification System Desmond Lobo, Paul Watters and Xinwen Wu Internet Commerce Security
Host-based Intrusion Prevention System (HIPS)
Host-based Intrusion Prevention System (HIPS) White Paper Document Version ( esnhips 14.0.0.1) Creation Date: 6 th Feb, 2013 Host-based Intrusion Prevention System (HIPS) Few years back, it was relatively
Chapter 6. The stacking ensemble approach
82 This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc are also described
ANDROID BASED MOBILE APPLICATION DEVELOPMENT and its SECURITY
ANDROID BASED MOBILE APPLICATION DEVELOPMENT and its SECURITY Suhas Holla #1, Mahima M Katti #2 # Department of Information Science & Engg, R V College of Engineering Bangalore, India Abstract In the advancing
KEITH LEHNERT AND ERIC FRIEDRICH
MACHINE LEARNING CLASSIFICATION OF MALICIOUS NETWORK TRAFFIC KEITH LEHNERT AND ERIC FRIEDRICH 1. Introduction 1.1. Intrusion Detection Systems. In our society, information systems are everywhere. They
Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus
Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus 1. Introduction Facebook is a social networking website with an open platform that enables developers to extract and utilize user information
Classifying Manipulation Primitives from Visual Data
Classifying Manipulation Primitives from Visual Data Sandy Huang and Dylan Hadfield-Menell Abstract One approach to learning from demonstrations in robotics is to make use of a classifier to predict if
Malware Detection Module using Machine Learning Algorithms to Assist in Centralized Security in Enterprise Networks
Malware Detection Module using Machine Learning Algorithms to Assist in Centralized Security in Enterprise Networks Priyank Singhal Student, Computer Engineering Sardar Patel Institute of Technology University
Extend Table Lens for High-Dimensional Data Visualization and Classification Mining
Extend Table Lens for High-Dimensional Data Visualization and Classification Mining CPSC 533c, Information Visualization Course Project, Term 2 2003 Fengdong Du [email protected] University of British Columbia
OS KERNEL MALWARE DETECTION USING KERNEL CRIME DATA MINING
OS KERNEL MALWARE DETECTION USING KERNEL CRIME DATA MINING MONISHA.T #1 and Mrs.UMA.S *2 # ME,PG Scholar,Department of CSE, SKR Engineering College,Poonamallee,Chennai,TamilNadu * ME,Assist.professor,
Predict the Popularity of YouTube Videos Using Early View Data
000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050
Visual Analysis of Malware Behavior Using Treemaps and Thread Graphs
Visual Analysis of Malware Behavior Using Treemaps and Thread Graphs Philipp Trinius Thorsten Holz Jan Göbel Felix C. Freiling Laboratory for Dependable Distributed Systems, University of Mannheim, Germany
Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms
Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms Scott Pion and Lutz Hamel Abstract This paper presents the results of a series of analyses performed on direct mail
Support Vector Machines for Dynamic Biometric Handwriting Classification
Support Vector Machines for Dynamic Biometric Handwriting Classification Tobias Scheidat, Marcus Leich, Mark Alexander, and Claus Vielhauer Abstract Biometric user authentication is a recent topic in the
Class #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris
Class #6: Non-linear classification ML4Bio 2012 February 17 th, 2012 Quaid Morris 1 Module #: Title of Module 2 Review Overview Linear separability Non-linear classification Linear Support Vector Machines
Detecting Internet Worms Using Data Mining Techniques
Detecting Internet Worms Using Data Mining Techniques Muazzam SIDDIQUI Morgan C. WANG Institute of Simulation & Training Department of Statistics and Actuarial Sciences University of Central Florida University
CAS : A FRAMEWORK OF ONLINE DETECTING ADVANCE MALWARE FAMILIES FOR CLOUD-BASED SECURITY
CAS : A FRAMEWORK OF ONLINE DETECTING ADVANCE MALWARE FAMILIES FOR CLOUD-BASED SECURITY ABHILASH SREERAMANENI DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING SEOUL NATIONAL UNIVERSITY OF SCIENCE AND TECHNOLOGY
PE Explorer. Heaventools. Malware Code Analysis Made Easy
Heaventools PE Explorer Data Sheet Malware Code Analysis Made Easy Reverse engineers within the anti-virus, vulnerability research and forensics companies face the challenge of analysing a large number
Novelty Detection in image recognition using IRF Neural Networks properties
Novelty Detection in image recognition using IRF Neural Networks properties Philippe Smagghe, Jean-Luc Buessler, Jean-Philippe Urban Université de Haute-Alsace MIPS 4, rue des Frères Lumière, 68093 Mulhouse,
D-optimal plans in observational studies
D-optimal plans in observational studies Constanze Pumplün Stefan Rüping Katharina Morik Claus Weihs October 11, 2005 Abstract This paper investigates the use of Design of Experiments in observational
An Order-Invariant Time Series Distance Measure [Position on Recent Developments in Time Series Analysis]
An Order-Invariant Time Series Distance Measure [Position on Recent Developments in Time Series Analysis] Stephan Spiegel and Sahin Albayrak DAI-Lab, Technische Universität Berlin, Ernst-Reuter-Platz 7,
Classification of Fingerprints. Sarat C. Dass Department of Statistics & Probability
Classification of Fingerprints Sarat C. Dass Department of Statistics & Probability Fingerprint Classification Fingerprint classification is a coarse level partitioning of a fingerprint database into smaller
The Droid Knight: a silent guardian for the Android kernel, hunting for rogue smartphone malware applications
Virus Bulletin 2013 The Droid Knight: a silent guardian for the Android kernel, hunting for rogue smartphone malware applications Next Generation Intelligent Networks Research Center (nexgin RC) http://wwwnexginrcorg/
Machine Learning Final Project Spam Email Filtering
Machine Learning Final Project Spam Email Filtering March 2013 Shahar Yifrah Guy Lev Table of Content 1. OVERVIEW... 3 2. DATASET... 3 2.1 SOURCE... 3 2.2 CREATION OF TRAINING AND TEST SETS... 4 2.3 FEATURE
Bisecting K-Means for Clustering Web Log data
Bisecting K-Means for Clustering Web Log data Ruchika R. Patil Department of Computer Technology YCCE Nagpur, India Amreen Khan Department of Computer Technology YCCE Nagpur, India ABSTRACT Web usage mining
Social Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
Automating Linux Malware Analysis Using Limon Sandbox Monnappa K A [email protected]
Automating Linux Malware Analysis Using Limon Sandbox Monnappa K A [email protected] A number of devices are running Linux due to its flexibility and open source nature. This has made Linux platform
International Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 3, May-June 2015
RESEARCH ARTICLE OPEN ACCESS Data Mining Technology for Efficient Network Security Management Ankit Naik [1], S.W. Ahmad [2] Student [1], Assistant Professor [2] Department of Computer Science and Engineering
PFP Technology White Paper
PFP Technology White Paper Summary PFP Cybersecurity solution is an intrusion detection solution based on observing tiny patterns on the processor power consumption. PFP is capable of detecting intrusions
Impact of Feature Selection on the Performance of Wireless Intrusion Detection Systems
2009 International Conference on Computer Engineering and Applications IPCSIT vol.2 (2011) (2011) IACSIT Press, Singapore Impact of Feature Selection on the Performance of ireless Intrusion Detection Systems
Regions in a circle. 7 points 57 regions
Regions in a circle 1 point 1 region points regions 3 points 4 regions 4 points 8 regions 5 points 16 regions The question is, what is the next picture? How many regions will 6 points give? There's an
Testing LTL Formula Translation into Büchi Automata
Testing LTL Formula Translation into Büchi Automata Heikki Tauriainen and Keijo Heljanko Helsinki University of Technology, Laboratory for Theoretical Computer Science, P. O. Box 5400, FIN-02015 HUT, Finland
Support Vector Machines Explained
March 1, 2009 Support Vector Machines Explained Tristan Fletcher www.cs.ucl.ac.uk/staff/t.fletcher/ Introduction This document has been written in an attempt to make the Support Vector Machines (SVM),
Evaluating Host-based Anomaly Detection Systems: Application of The One-class SVM Algorithm to ADFA-LD
Evaluating Host-based Anomaly Detection Systems: Application of The One-class SVM Algorithm to ADFA-LD Miao Xie, Jiankun Hu and Jill Slay School of Engineering and Information Technology University of
Lecture Embedded System Security A. R. Sadeghi, @TU Darmstadt, 2011 2012 Introduction Mobile Security
Smartphones and their applications have become an integral part of information society Security and privacy protection technology is an enabler for innovative business models Recent research on mobile
Index Terms Domain name, Firewall, Packet, Phishing, URL.
BDD for Implementation of Packet Filter Firewall and Detecting Phishing Websites Naresh Shende Vidyalankar Institute of Technology Prof. S. K. Shinde Lokmanya Tilak College of Engineering Abstract Packet
Active Learning SVM for Blogs recommendation
Active Learning SVM for Blogs recommendation Xin Guan Computer Science, George Mason University Ⅰ.Introduction In the DH Now website, they try to review a big amount of blogs and articles and find the
Email Spam Detection Using Customized SimHash Function
International Journal of Research Studies in Computer Science and Engineering (IJRSCSE) Volume 1, Issue 8, December 2014, PP 35-40 ISSN 2349-4840 (Print) & ISSN 2349-4859 (Online) www.arcjournals.org Email
Studying Security Weaknesses of Android System
, pp. 7-12 http://dx.doi.org/10.14257/ijsia.2015.9.3.02 Studying Security Weaknesses of Android System Jae-Kyung Park* and Sang-Yong Choi** *Chief researcher at Cyber Security Research Center, Korea Advanced
Knowledge Discovery from patents using KMX Text Analytics
Knowledge Discovery from patents using KMX Text Analytics Dr. Anton Heijs [email protected] Treparel Abstract In this white paper we discuss how the KMX technology of Treparel can help searchers
INFORMATION SECURITY RISK ASSESSMENT UNDER UNCERTAINTY USING DYNAMIC BAYESIAN NETWORKS
INFORMATION SECURITY RISK ASSESSMENT UNDER UNCERTAINTY USING DYNAMIC BAYESIAN NETWORKS R. Sarala 1, M.Kayalvizhi 2, G.Zayaraz 3 1 Associate Professor, Computer Science and Engineering, Pondicherry Engineering
Component visualization methods for large legacy software in C/C++
Annales Mathematicae et Informaticae 44 (2015) pp. 23 33 http://ami.ektf.hu Component visualization methods for large legacy software in C/C++ Máté Cserép a, Dániel Krupp b a Eötvös Loránd University [email protected]
CALCULATIONS & STATISTICS
CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents
An Introduction to Data Mining
An Introduction to Intel Beijing [email protected] January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail
Music Mood Classification
Music Mood Classification CS 229 Project Report Jose Padial Ashish Goel Introduction The aim of the project was to develop a music mood classifier. There are many categories of mood into which songs may
Machine Learning using MapReduce
Machine Learning using MapReduce What is Machine Learning Machine learning is a subfield of artificial intelligence concerned with techniques that allow computers to improve their outputs based on previous
Radware s Behavioral Server Cracking Protection
Radware s Behavioral Server Cracking Protection A DefensePro Whitepaper By Renaud Bidou Senior Security Specialist,Radware October 2007 www.radware.com Page - 2 - Table of Contents Abstract...3 Information
Web Forensic Evidence of SQL Injection Analysis
International Journal of Science and Engineering Vol.5 No.1(2015):157-162 157 Web Forensic Evidence of SQL Injection Analysis 針 對 SQL Injection 攻 擊 鑑 識 之 分 析 Chinyang Henry Tseng 1 National Taipei University
Using Data Mining for Mobile Communication Clustering and Characterization
Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer
A Review on Zero Day Attack Safety Using Different Scenarios
Available online www.ejaet.com European Journal of Advances in Engineering and Technology, 2015, 2(1): 30-34 Review Article ISSN: 2394-658X A Review on Zero Day Attack Safety Using Different Scenarios
Regularized Logistic Regression for Mind Reading with Parallel Validation
Regularized Logistic Regression for Mind Reading with Parallel Validation Heikki Huttunen, Jukka-Pekka Kauppi, Jussi Tohka Tampere University of Technology Department of Signal Processing Tampere, Finland
Automating Mimicry Attacks Using Static Binary Analysis
Automating Mimicry Attacks Using Static Binary Analysis Christopher Kruegel and Engin Kirda Technical University Vienna [email protected], [email protected] Darren Mutz, William Robertson,
Detecting client-side e-banking fraud using a heuristic model
Detecting client-side e-banking fraud using a heuristic model Tim Timmermans [email protected] Jurgen Kloosterman [email protected] University of Amsterdam July 4, 2013 Tim Timmermans, Jurgen
Maximizing Precision of Hit Predictions in Baseball
Maximizing Precision of Hit Predictions in Baseball Jason Clavelli [email protected] Joel Gottsegen [email protected] December 13, 2013 Introduction In recent years, there has been increasing interest
Clustering through Decision Tree Construction in Geology
Nonlinear Analysis: Modelling and Control, 2001, v. 6, No. 2, 29-41 Clustering through Decision Tree Construction in Geology Received: 22.10.2001 Accepted: 31.10.2001 A. Juozapavičius, V. Rapševičius Faculty
Experiments in Web Page Classification for Semantic Web
Experiments in Web Page Classification for Semantic Web Asad Satti, Nick Cercone, Vlado Kešelj Faculty of Computer Science, Dalhousie University E-mail: {rashid,nick,vlado}@cs.dal.ca Abstract We address
Making the difference between read to output, and read to copy GOING BEYOND BASIC FILE AUDITING FOR DATA PROTECTION
Making the difference between read to output, and read to copy GOING BEYOND BASIC FILE AUDITING FOR DATA PROTECTION MOST OF THE IMPORTANT DATA LOSS VECTORS DEPEND ON COPYING files in order to compromise
Multiple Kernel Learning on the Limit Order Book
JMLR: Workshop and Conference Proceedings 11 (2010) 167 174 Workshop on Applications of Pattern Analysis Multiple Kernel Learning on the Limit Order Book Tristan Fletcher Zakria Hussain John Shawe-Taylor
A Partially Supervised Metric Multidimensional Scaling Algorithm for Textual Data Visualization
A Partially Supervised Metric Multidimensional Scaling Algorithm for Textual Data Visualization Ángela Blanco Universidad Pontificia de Salamanca [email protected] Spain Manuel Martín-Merino Universidad
This unit will lay the groundwork for later units where the students will extend this knowledge to quadratic and exponential functions.
Algebra I Overview View unit yearlong overview here Many of the concepts presented in Algebra I are progressions of concepts that were introduced in grades 6 through 8. The content presented in this course
Bayesian Spam Detection
Scholarly Horizons: University of Minnesota, Morris Undergraduate Journal Volume 2 Issue 1 Article 2 2015 Bayesian Spam Detection Jeremy J. Eberhardt University or Minnesota, Morris Follow this and additional
OUTLIER ANALYSIS. Data Mining 1
OUTLIER ANALYSIS Data Mining 1 What Are Outliers? Outlier: A data object that deviates significantly from the normal objects as if it were generated by a different mechanism Ex.: Unusual credit card purchase,
Supervised Feature Selection & Unsupervised Dimensionality Reduction
Supervised Feature Selection & Unsupervised Dimensionality Reduction Feature Subset Selection Supervised: class labels are given Select a subset of the problem features Why? Redundant features much or
LASTLINE WHITEPAPER. The Holy Grail: Automatically Identifying Command and Control Connections from Bot Traffic
LASTLINE WHITEPAPER The Holy Grail: Automatically Identifying Command and Control Connections from Bot Traffic Abstract A distinguishing characteristic of bots is their ability to establish a command and
arxiv:1506.04135v1 [cs.ir] 12 Jun 2015
Reducing offline evaluation bias of collaborative filtering algorithms Arnaud de Myttenaere 1,2, Boris Golden 1, Bénédicte Le Grand 3 & Fabrice Rossi 2 arxiv:1506.04135v1 [cs.ir] 12 Jun 2015 1 - Viadeo
Data Mining Algorithms Part 1. Dejan Sarka
Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka ([email protected]) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses
IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 1, March, 2013 ISSN: 2320-8791 www.ijreat.
Intrusion Detection in Cloud for Smart Phones Namitha Jacob Department of Information Technology, SRM University, Chennai, India Abstract The popularity of smart phone is increasing day to day and the
Modeling System Calls for Intrusion Detection with Dynamic Window Sizes
Modeling System Calls for Intrusion Detection with Dynamic Window Sizes Eleazar Eskin Computer Science Department Columbia University 5 West 2th Street, New York, NY 27 [email protected] Salvatore
Programming Exercise 3: Multi-class Classification and Neural Networks
Programming Exercise 3: Multi-class Classification and Neural Networks Machine Learning November 4, 2011 Introduction In this exercise, you will implement one-vs-all logistic regression and neural networks
Mobile Phone APP Software Browsing Behavior using Clustering Analysis
Proceedings of the 2014 International Conference on Industrial Engineering and Operations Management Bali, Indonesia, January 7 9, 2014 Mobile Phone APP Software Browsing Behavior using Clustering Analysis
A Practical Analysis of Smartphone Security*
A Practical Analysis of Smartphone Security* Woongryul Jeon 1, Jeeyeon Kim 1, Youngsook Lee 2, and Dongho Won 1,** 1 School of Information and Communication Engineering, Sungkyunkwan University, Korea
SEMANTIC SECURITY ANALYSIS OF SCADA NETWORKS TO DETECT MALICIOUS CONTROL COMMANDS IN POWER GRID
SEMANTIC SECURITY ANALYSIS OF SCADA NETWORKS TO DETECT MALICIOUS CONTROL COMMANDS IN POWER GRID ZBIGNIEW KALBARCZYK EMAIL: [email protected] UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN JANUARY 2014
Data Mining - Evaluation of Classifiers
Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010
99.37, 99.38, 99.38, 99.39, 99.39, 99.39, 99.39, 99.40, 99.41, 99.42 cm
Error Analysis and the Gaussian Distribution In experimental science theory lives or dies based on the results of experimental evidence and thus the analysis of this evidence is a critical part of the
LASTLINE WHITEPAPER. Using Passive DNS Analysis to Automatically Detect Malicious Domains
LASTLINE WHITEPAPER Using Passive DNS Analysis to Automatically Detect Malicious Domains Abstract The domain name service (DNS) plays an important role in the operation of the Internet, providing a two-way
Symantec's Secret Sauce for Mobile Threat Protection. Jon Dreyfus, Ellen Linardi, Matthew Yeo
Symantec's Secret Sauce for Mobile Threat Protection Jon Dreyfus, Ellen Linardi, Matthew Yeo 1 Agenda 1 2 3 4 Threat landscape and Mobile Insight overview What s unique about Mobile Insight Mobile Insight
Sonatype CLM Server - Dashboard. Sonatype CLM Server - Dashboard
Sonatype CLM Server - Dashboard i Sonatype CLM Server - Dashboard Sonatype CLM Server - Dashboard ii Contents 1 Introduction 1 2 Accessing the Dashboard 3 3 Viewing CLM Data in the Dashboard 4 3.1 Filters............................................
Dynamic Spyware Analysis
Dynamic Spyware Analysis Manuel Egele, Christopher Kruegel, Engin Kirda, Heng Yin, and Dawn Song Secure Systems Lab Technical University Vienna {pizzaman,chris,ek}@seclab.tuwien.ac.at Carnegie Mellon University
CHAPTER 1 INTRODUCTION
21 CHAPTER 1 INTRODUCTION 1.1 PREAMBLE Wireless ad-hoc network is an autonomous system of wireless nodes connected by wireless links. Wireless ad-hoc network provides a communication over the shared wireless
JPEG compression of monochrome 2D-barcode images using DCT coefficient distributions
Edith Cowan University Research Online ECU Publications Pre. JPEG compression of monochrome D-barcode images using DCT coefficient distributions Keng Teong Tan Hong Kong Baptist University Douglas Chai
Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),
Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables
A survey on Data Mining based Intrusion Detection Systems
International Journal of Computer Networks and Communications Security VOL. 2, NO. 12, DECEMBER 2014, 485 490 Available online at: www.ijcncs.org ISSN 2308-9830 A survey on Data Mining based Intrusion
Identification of File Integrity Requirement through Severity Analysis
Identification of File Integrity Requirement through Severity Analysis Zul Hilmi Abdullah a, Shaharudin Ismail a, Nur Izura Udzir b a Fakulti Sains dan Teknologi, Universiti Sains Islam Malaysia, Bandar
