MEF: Malicious Email Filter - A UNIX Mail Filter that Detects Malicious Windows Executables


Matthew G. Schultz, Eleazar Eskin, and Salvatore J. Stolfo
Computer Science Department, Columbia University

Abstract

We present Malicious Email Filter, MEF, a freely distributed malicious binary filter incorporated into Procmail that can detect malicious Windows attachments by integrating with a UNIX mail server. The system has three capabilities: detection of known and unknown malicious attachments, automatic propagation of detection models, and the ability to monitor the spread of malicious attachments. The system filters malicious attachments from emails by using detection models obtained from data mining over known malicious attachments. It leverages research in data mining applied to malicious executables, which allows the detection of previously unseen malicious attachments. These new malicious attachments are programs that are most likely undetectable by current virus scanners because detection signatures for them have not yet been generated. The system also allows for the automatic propagation of detection models from a central server. Finally, the system allows for monitoring and measurement of the spread of malicious attachments. The system will be released under GPL.

1 Introduction

A serious security risk today is the propagation of malicious executables through email attachments. A malicious executable is defined to be a program that performs a malicious function, such as compromising a system's security, damaging a system, or obtaining sensitive information without the user's permission. Recently there have been some high profile incidents with malicious email attachments such as the ILOVEYOU virus and its clones. These malicious attachments caused an incredible amount of damage in a very short time. The Malicious Email Filter (MEF) project provides a tool for the protection of systems against malicious email attachments.
A freely available, open source filter that operates from UNIX to detect malicious Windows binaries has many advantages. Operating from a UNIX host, the filter could automatically filter the email each host receives. The UNIX server could either wrap the malicious email with a warning addressed to the user, or it could block the email. All of this could be done without the server's users having to scan attachments themselves or having to download updates for their virus scanners. This way the system administrator can be responsible for updating the filter instead of relying on end users.

The standard approach to protecting against malicious emails is to use a virus scanner. Commercial virus scanners can effectively detect known malicious executables, but unfortunately they cannot detect unknown malicious executables. The reason is that most of these virus scanners are signature based. For each known malicious binary, the scanner contains a byte sequence that identifies the malicious binary. An unknown malicious binary, one without a pre-existing signature, will therefore most likely go undetected.

We built upon research at Columbia University on data mining methods to detect malicious binaries [5]. The idea is that by using data mining, knowledge of known malicious executables can be generalized to detect unknown malicious executables. Data mining methods are ideal for this purpose because they detect patterns in large amounts of data, such as byte code, and use these patterns to detect future instances in similar data along with detecting known instances. Our framework used classifiers to detect malicious executables. A classifier is a rule set, or detection model, generated by the data mining algorithm that was trained over a given set of training data.

The goal of this project is to design a freely distributed data mining based filter which integrates with Procmail's pre-existing security filter [1].
This filter's purpose is to detect both the set of known malicious binaries and a set of previously unseen, but similar, malicious binaries. The system also provides a mechanism for automatically updating the detection models from a central server. Finally, the filter reports statistics on the malicious binaries that it processes, which allows for the monitoring and measuring of the propagation of malicious attachments across email hosts. The system will be released under GPL.

Since the methods we describe are probabilistic, we can also tell if a binary is borderline. A borderline binary is a program that has similar probabilities for both classes (i.e. it could be a malicious executable or a benign program). If it is a borderline case, there is an option in the network filter to send a copy of the executable to a central repository such as CERT. There, it can be examined by human experts. When combined with the automatic distribution of detection models, this can reduce the time a host is vulnerable to a malicious attachment.

The detection model generation worked as follows. The binaries were first statically analyzed to extract byte sequences, and then the classifiers were trained over a subset of the data. This generated a detection model that was tested on a set of previously unseen data. We implemented a traditional, signature-based algorithm to compare its performance with that of the data mining algorithms. Using standard statistical cross-validation techniques, this framework had a detection rate of 97.76%, over double the detection rate of a signature-based scanner.

The organization of the paper is as follows. We first present how the system works and how it is integrated with Procmail. Secondly, we describe how the detection models are updated. Then we describe how the system can be used for monitoring the propagation of email attachments. At the end, we summarize the research in data mining for creating detection models that can generalize to new and unknown malicious binaries, and report the accuracy and performance of the system.

2 Incorporation into Procmail

MEF filters malicious attachments by replacing the standard virus filter found in Procmail with a data mining generated detection model. Procmail is invoked by the mail server to extract the attachment. Currently the only mail server supported is sendmail. The filter first decodes the binary and then examines it using a data mining based model.
The filter evaluates the binary by comparing all the byte strings found in it to the byte sequences contained in the detection model. A probability of the binary being malicious is calculated, and if it is greater than its likelihood of being benign then the executable is labeled malicious. Otherwise the binary is labeled benign. Depending on the result of the filter, Procmail either passes the mail untouched, if the attachment is determined to be not malicious, or warns the recipient of the mail that the attachment is malicious.

2.1 Borderline Cases

Borderline binaries are binaries that have similar probabilities of being benign and malicious (e.g. a 50% chance of being malicious and a 50% chance of being benign). These binaries can be analyzed by experts to determine whether they are malicious or not, and subsequently included in the future generation of detection models. A simple metric to detect borderline cases and redirect them to an evaluation party would be to define a borderline case to be a case where the probability it is malicious is above some threshold. There is a tradeoff between the detection rate of the filter and the false positive rate. The detection rate is the percentage of malicious email attachments detected, while the false positive rate is the percentage of the normal attachments labeled as malicious. This threshold can be set based on the policies of the host.

Receiving borderline cases and updating the detection model is an important aspect of the data mining approach. Borderline cases are executables that could lower the detection and accuracy rates by being misclassified, so the larger the data set used to generate the models, including correctly labeled borderline cases, the more accurate the detection models are.

2.2 Update Algorithms

After a number of borderline cases have been received, it may be necessary to generate a new detection model.
This would be accomplished by retraining the data mining algorithm over the data set containing the borderline cases along with their correct classification. This would generate a new detection model that, when combined with the old detection model, would detect a larger set of malicious binaries.

The system provides a mechanism for automatically combining detection models. Because of this we can send only the portions of the models that have changed as updates. This is important because the detection models can get very large, yet we want to update the model with each new malicious attachment discovered. In order to avoid constantly sending a large model to the filters, we can just send the information about the new attachment and have it integrated with the original model.

Combining detection models is possible because the underlying representation of the models is probabilistic. To generate a detection model, the algorithm counted the number of times that a byte string appeared in a malicious program versus the number of times that it appeared in a benign program. From these counts the algorithm computes the probability that an attachment is malicious in a method described later in the paper. In order to combine the models, we simply need to sum the counts of the old model with the new information.

As shown in Figure 1, in Model A, the old detection model, a byte string occurred 99 times in the malicious class and 1 time in the benign class. In Model B, the update model, the same byte string was found 3 times in the malicious class and 4 times in the benign class. The combination of models A and B would state that the byte string occurred 102 times in the malicious class and 5 times in the benign class. The combination of A and B would be the new detection model after an update.

Model A
  The byte string occurred in 99 malicious executables
  The byte string occurred in 1 benign executable
Model B
  The byte string occurred in 3 malicious executables
  The byte string occurred in 4 benign executables
Combined Model
  The byte string occurred in 102 malicious executables
  The byte string occurred in 5 benign executables

Figure 1: Sample Combined Model resulting from applying the update model, B, to the old model, A.

The mechanism for propagating the detection models is through encrypted email. The Procmail filter can receive the detection model through a secure email sent to the mail server, and then automatically update the model. This email will be addressed and formatted in such a way that the Procmail filter will process it and update the detection models.

3 Monitoring the Propagation of Email Attachments

Tracking the propagation of email attachments would be beneficial in identifying the origin of malicious executables, and in estimating a malicious attachment's prevalence. The monitoring is done by having each host that is using the system log the malicious attachments and the borderline attachments that are sent to and from the host. This logging may or may not record the sender or receiver of the mail, depending on the privacy policy of the host.

In order to log the attachments, we need a way to obtain an identifier for each attachment. We do this by computing a hash function over the byte sequences in the binary, obtaining a unique identifier. The logs of malicious attachments are sent back to the central server. Since there is a unique identifier for each binary, we can measure the propagation of the malicious binaries across hosts by examining their logs. From these logs we can estimate how many copies of each malicious binary are circulating the Internet.
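The logging scheme described in this section can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the paper does not say which hash function is used, so SHA-256 is an assumption, and the names `attachment_id` and `hosts_per_attachment`, as well as the log layout (a host name mapped to a list of identifiers), are hypothetical.

```python
import hashlib
from collections import defaultdict


def attachment_id(data):
    """Return a hex identifier for an attachment's raw bytes.

    SHA-256 is an assumption; the paper only says "a hash function".
    """
    return hashlib.sha256(data).hexdigest()


def hosts_per_attachment(host_logs):
    """Count how many distinct hosts logged each attachment identifier.

    host_logs maps a host name to the list of identifiers it logged
    (a hypothetical log layout chosen for this sketch).
    """
    hosts_seen = defaultdict(set)
    for host, identifiers in host_logs.items():
        for identifier in identifiers:
            hosts_seen[identifier].add(host)
    return {identifier: len(hosts) for identifier, hosts in hosts_seen.items()}


if __name__ == "__main__":
    sample = attachment_id(b"MZ\x90\x00 example bytes")
    logs = {"mail-a": [sample, sample], "mail-b": [sample]}
    for identifier, n_hosts in hosts_per_attachment(logs).items():
        print(identifier[:16], "seen on", n_hosts, "hosts")
```

Counting distinct hosts rather than raw log entries is a design choice of this sketch; a central server could equally sum entries to estimate the number of circulating copies.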
The current method for detailing the propagation of malicious executables is for an administrator to report an attack to an agency such as WildList [9]. The wild list is a list of the propagation of viruses in the wild and a list of the most prevalent viruses. This is not done automatically, but instead is based upon a report issued by an attacked host. Our method would reliably and automatically detail a malicious executable's spread over the Internet.

4 Methodology for Building Data Mining Detection Models

We gathered a large set of programs from public sources and separated the problem into two classes: malicious and benign executables. Each example in the data set is a Windows or MS-DOS format executable, although the framework we present is applicable to other formats. To standardize our data set, we used McAfee's [4] virus scanner and labeled our programs as either malicious or benign executables. We split the data set into two subsets: the training set and the test set. The data mining algorithms used the training set while generating the rule sets, and after training we used the test set to measure the accuracy of the classifiers over unseen examples.

4.1 Data Set

The data set consisted of a total of 4,301 programs split into 3,301 malicious binaries and 1,000 clean programs. The malicious binaries consisted of viruses, Trojans, and cracker/network tools. There were no duplicate programs in the data set and every example in the set is labeled either malicious or benign by the commercial virus scanner. All labels are assumed to be correct. All programs were gathered either from FTP sites or from personal computers in the Data Mining Lab here at Columbia University. The entire data set is available off our public ftp site ftp://ftp.cs.columbia.edu/pub/mgs

4.2 Detection Algorithms

We statically extracted byte sequence features from each binary example for the algorithms to use to generate their detection models.
Features in a data mining framework are properties extracted from each example in the data set, such as byte sequences, that a classifier uses to generate detection models. These features were then used by the algorithms to generate detection models. We used hexdump [7], an open source tool that transforms binary files into hexadecimal files. After we generated the hexdumps we had features in the form of Figure 2, where each line represents a short sequence of machine code instructions.

646e 776f 2e73 0a0d e 3c05 026c c e f a

Figure 2: Example Set of Byte Sequence Features
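A rough sketch of this feature extraction step is given below. It does not call the hexdump tool cited as [7]; instead it assumes a display of 16-bit little-endian words, which is consistent with the first four tokens of Figure 2 (they decode to the bytes of "ndows." followed by a carriage return and line feed). The function name `byte_sequence_features` and the choice of eight words per line are hypothetical.

```python
def byte_sequence_features(data, words_per_line=8):
    """Turn raw bytes into lines of space-separated 16-bit hex words.

    Each pair of bytes is shown as a little-endian word (second byte
    first), an assumption based on the tokens visible in Figure 2.
    """
    if len(data) % 2:
        data += b"\x00"  # pad so the last byte still forms a full word
    words = [
        "{:02x}{:02x}".format(data[i + 1], data[i])
        for i in range(0, len(data), 2)
    ]
    return [
        " ".join(words[i:i + words_per_line])
        for i in range(0, len(words), words_per_line)
    ]


if __name__ == "__main__":
    for line in byte_sequence_features(b"ndows.\r\n"):
        print(line)  # prints "646e 776f 2e73 0a0d"
```

Each returned line can then be treated as one feature of the binary by the classifiers discussed in the following sections.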

4.3 Signature-Based Approach

To compare our results with traditional methods we implemented a signature based method. First, we calculated the byte sequences that were only found in the malicious executable class. These byte sequences were then concatenated together to make a unique signature for each malicious executable example. Thus each malicious executable signature contained only byte sequences found in the malicious executable class. The byte sequences found in each example were concatenated into one signature because a byte sequence that was only found in one class during training could possibly be found in the other class during testing [2], and lead to false positives when deployed.

Since the virus scanner that we used to label the data set had been trained over every example in our data set, it was necessary to implement a similar signature-based method to compare with the data mining algorithms. There was no way to use an off-the-shelf virus scanner and simulate the detection of new malicious executables, because these commercial scanners contained signatures for all the malicious executables in our data set. In our tests the signature-based algorithm was only allowed to generate signatures over the set of training data. This allowed our data mining framework to be fairly compared to traditional scanners over new data.

4.4 Data Mining Approach

The classifier we incorporated into Procmail was a Naive Bayes classifier [6]. A Naive Bayes classifier computes the likelihood that a program is malicious given the features that are contained in the program. We assumed that there were similar byte sequences in malicious executables that differentiated them from benign programs, and that the class of benign programs had similar sequences that differentiated them from the malicious executables.
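The count-based model representation from Section 2.2 and the Naive Bayes decision rule can be sketched together as below. This is an illustration under stated assumptions, not the authors' code: the class name `CountModel` and its methods are hypothetical, add-one (Laplace) smoothing is an assumption the paper does not mention, and only the features present in a binary contribute to the score, which is a simplification.

```python
import math
from collections import Counter

CLASSES = ("malicious", "benign")


class CountModel:
    """Per-class counts of byte-sequence features (hypothetical layout)."""

    def __init__(self):
        self.feature_counts = {label: Counter() for label in CLASSES}
        self.example_counts = Counter()

    def train(self, features, label):
        """Record one labeled executable and the features it contains."""
        self.example_counts[label] += 1
        self.feature_counts[label].update(set(features))

    def merge(self, update):
        """Apply an update model by summing counts, as in Figure 1."""
        for label in CLASSES:
            self.feature_counts[label].update(update.feature_counts[label])
        self.example_counts.update(update.example_counts)

    def log_score(self, features, label):
        """Log prior plus log likelihoods, with add-one smoothing (assumed)."""
        n = self.example_counts[label]
        total = sum(self.example_counts.values())
        score = math.log((n + 1) / (total + 2))
        for feature in set(features):
            score += math.log((self.feature_counts[label][feature] + 1) / (n + 2))
        return score

    def classify(self, features):
        """Label malicious only if its score exceeds the benign score."""
        malicious = self.log_score(features, "malicious")
        benign = self.log_score(features, "benign")
        return "malicious" if malicious > benign else "benign"


if __name__ == "__main__":
    model_a, model_b = CountModel(), CountModel()
    model_a.feature_counts["malicious"]["646e 776f"] = 99
    model_a.feature_counts["benign"]["646e 776f"] = 1
    model_b.feature_counts["malicious"]["646e 776f"] = 3
    model_b.feature_counts["benign"]["646e 776f"] = 4
    model_a.merge(model_b)
    print(model_a.feature_counts["malicious"]["646e 776f"],
          model_a.feature_counts["benign"]["646e 776f"])
```

Because the model is nothing more than counts, an update only needs to carry the counts that changed, which is the property Section 2.2 relies on.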
We used the Naive Bayes method to compute the probability of an executable being benign or malicious. The Naive Bayes method, however, required more than 1 GB of RAM to generate its detection model. To make the algorithm more efficient we divided the problem into smaller pieces that would fit in memory and trained a Naive Bayes algorithm over each of the subproblems. Each subproblem was to classify based on every 6th line of machine code instead of every line in the binary. We trained six Naive Bayes classifiers in this way so that every byte-sequence line in the training set had been trained over. We then used a voting algorithm that combined the outputs from the six Naive Bayes methods. The voting algorithm was then used to generate the detection model.

4.5 Preliminary Results

To quantitatively express the performance of our method we show tables with the counts for true positives, true negatives, false positives, and false negatives. A true positive, TP, is a malicious example that is correctly tagged as malicious, and a true negative, TN, is a benign example that is correctly classified. A false positive, FP, is a benign program that has been mislabeled by an algorithm as a malicious program, while a false negative, FN, is a malicious executable that has been misclassified as a benign program.

We estimated our results over new executables by using 5-fold cross validation [3]. Cross validation is the standard method to estimate likely predictions over unseen data in data mining. For each set of binary profiles we partitioned the data into 5 equal size partitions. We used 4 of the partitions for training and then evaluated the rule set over the remaining partition. We repeated the process 5 times, leaving out a different partition for testing each time. This gave us a reliable measure of our method's accuracy over unseen data. We averaged the results of these five tests to obtain a good measure of how the algorithm performs over the entire set.
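The evaluation procedure above can be sketched as follows. The function names `k_fold_splits` and `rates` are hypothetical, the partitioning by stride (without shuffling) is an assumption, and the counts in the usage example are made up for illustration; they are not the paper's results. The overall accuracy formula is the standard one, which the paper does not spell out.

```python
def k_fold_splits(n_examples, k=5):
    """Yield (train_indices, test_indices) for each of k folds.

    Indices are dealt into folds by stride; a real experiment would
    typically shuffle first (not done here, as an assumption).
    """
    indices = list(range(n_examples))
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [
            index
            for fold_number, fold in enumerate(folds)
            if fold_number != held_out
            for index in fold
        ]
        yield train, test


def rates(tp, tn, fp, fn):
    """Compute the three rates reported in the results tables."""
    return {
        # share of malicious examples that were detected
        "detection_rate": tp / (tp + fn),
        # share of benign examples that were labeled malicious
        "false_positive_rate": fp / (fp + tn),
        # share of all examples that were labeled correctly
        "overall_accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


if __name__ == "__main__":
    for train, test in k_fold_splits(10, k=5):
        print(len(train), "training examples,", len(test), "test examples")
    # Illustrative counts only, not taken from the paper.
    print(rates(tp=90, tn=95, fp=5, fn=10))
```

Averaging the per-fold rates then gives the single estimate described in the text.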
4.6 New Executables

To evaluate the algorithms over new executables, the algorithms generated their detection models over the set of training data and then tested their models over the set of test data. This was done five times in accordance with five fold cross validation.

As shown in Table 1, the data mining algorithm had the higher detection rate of the two methods we tested, 97.76% compared with the signature based method's detection rate of 33.96%. Along with the higher detection rate, the data mining method had a higher overall accuracy, 96.88% vs 49.31%. The false positive rate at 6.01% though was higher than the signature based method, 3.80%.

For the algorithms we plotted the detection rate vs. false positive rate using ROC curves [10]. ROC (Receiver Operating Characteristic) curves are a way of visualizing the trade-offs between detection and false positive rates. In this instance, the ROC curves in Figure 3 show how the data mining method can be configured for different applications. For a false positive rate less than or equal to 1% the detection rate would be greater than 70%, and for a false positive rate greater than 8% the detection rate would be greater than 99%.

Profile Type | True Positives (TP) | True Negatives (TN) | False Positives (FP) | False Negatives (FN) | Detection Rate | False Positive Rate | Overall Accuracy
Signature Method | | | | | 33.96% | 0% | 49.31%
Data Mining Method | | | | | 97.76% | 6.01% | 96.88%

Table 1: Results of classifying new malicious programs organized by algorithm and feature. Note the Data Mining Method had a higher detection rate and accuracy while the Signature based method had the lowest false positive rate.

Figure 3: Data Mining ROC (detection rate plotted against false positive rate for the Data Mining and Signature Based methods). Note that the Data Mining method has a higher detection rate than the signature method with a greater than 0.5% false positive rate.

4.7 Known Executables

To evaluate the performance of the algorithms over known executables, the algorithms generated detection models over the entire set of data and then their performance was evaluated by testing over the same examples they had seen during training.

Profile Type | True Positives (TP) | True Negatives (TN) | False Positives (FP) | False Negatives (FN) | Detection Rate | False Positive Rate | Overall Accuracy
Signature Method | | | | | 100% | 0% | 100%
Data Mining Method | | | | | 100% | 0% | 100%

Table 2: Results of classifying known malicious programs organized by algorithm and feature. The Signature based method and the data mining method had the same results.

As shown in Table 2, over known executables the methods had the same performance. Each had an accuracy of 100% over the entire data set, correctly classifying all the malicious examples as malicious and all the benign examples as benign.

The data mining algorithm detected 100% of known executables because it was merged with the signature based method. Without merging, the data mining algorithm detected 99.87% of the malicious examples and misclassified 2% of the benign binaries as malicious. However, we have the signatures for the binaries that the data mining algorithm misclassified, and the algorithm can include those signatures in the detection model without lowering accuracy over unknown binaries.
After the signatures for the executables that were misclassified during training had been generated and included in the detection model, the data mining model had a 100% accuracy rate over known executables.

5 Data Mining Performance

Model generation and deployment had different time and space requirements.

5.1 Training

In order for the data mining algorithm to quickly generate the models, it required all calculations to be done in memory. Running the algorithm took in excess of a gigabyte of RAM. By splitting the data into smaller pieces, the algorithm could be run in memory with a small loss (1.5%) in accuracy and detection rate. The calculations could then be done on a machine with less than 1 GB of RAM. This splitting allowed the model generation to be performed in less than two hours.

5.2 During Deployment

Online evaluation of executables can take place with much less memory. The models could be stored in smaller pieces on the file system, which would allow quicker analysis than having one large model that needs to be loaded into memory for the evaluation of each binary. By not loading the whole model into memory during evaluation we also avoid problems on computers with small amounts of memory (128 MB), and avoid taking memory away from the other processes running on the server. Since the algorithm analyzes each line of byte code contained in the binary, the time required for each evaluation varies. However, this can be done efficiently.

6 Conclusions

The first contribution that we presented in this paper was a freely distributed filter for Procmail that detected known malicious Windows executables and previously unknown malicious Windows binaries from UNIX. The detection rate of new executables was over twice that of the traditional signature based methods, 97.76% compared with 33.96%. In addition, the system we presented has the capability of automatically receiving updated detection models and the ability to monitor the propagation of malicious attachments.

One problem with traditional, signature-based methods is that in order to detect a new malicious executable, the program needs to be examined and a signature extracted from it and included in the anti-virus database. The difficulty with this method is that during the time required for a malicious program to be identified and analyzed, and for signatures to be distributed, systems are at risk from that program. Our methods may provide a defense during that time. With a low false positive rate, the inconvenience to the end user would be minimal while providing ample defense during the time before an update of models is available.

Virus scanners are updated about every month, and new malicious executables are created in that time (8 to 10 a day [8]). At the detection rates reported above, our method would catch roughly 98% of those new malicious executables without the need for an update, whereas traditional methods would catch only about 34%. Our method more than doubles the detection rate of signature based methods.

References

[1] John Hardin. Enhancing E-Mail Security With Procmail. Online publication.
[2] Jeffrey O. Kephart and William C. Arnold. Automatic Extraction of Computer Virus Signatures. 4th Virus Bulletin International Conference.
[3] R. Kohavi. A study of cross-validation and bootstrap for accuracy estimation and model selection. IJCAI.
[4] McAfee. Homepage - MacAfee.com. Online publication.
[5] MEF Group. Malicious Email Filter Group. Online publication.
[6] D. Michie, D. J. Spiegelhalter, and C. C. Taylor. Machine learning of rules and trees. In Machine Learning, Neural and Statistical Classification, Ellis Horwood, New York, pages 50-83, 1994.
[7] Peter Miller. Hexdump. Online publication, millerp/hexdump.html
[8] Steve R. White, Morton Swimmer, Edward J. Pring, William C. Arnold, David M. Chess, and John F. Morar. Anatomy of a Commercial-Grade Immune System. IBM Research White Paper, Anatomy/anatomy.html
[9] Wildlist Organization. Virus description of viruses in the wild. Online publication.
[10] K. H. Zou, W. J. Hall, and D. Shapiro. Smooth nonparametric ROC curves for continuous diagnostic tests. Statistics in Medicine, 1997.


Performance Evaluation of Intrusion Detection Systems

Performance Evaluation of Intrusion Detection Systems Performance Evaluation of Intrusion Detection Systems Waleed Farag & Sanwar Ali Department of Computer Science at Indiana University of Pennsylvania ABIT 2006 Outline Introduction: Intrusion Detection

More information

Feature Subset Selection in E-mail Spam Detection

Feature Subset Selection in E-mail Spam Detection Feature Subset Selection in E-mail Spam Detection Amir Rajabi Behjat, Universiti Technology MARA, Malaysia IT Security for the Next Generation Asia Pacific & MEA Cup, Hong Kong 14-16 March, 2012 Feature

More information

Knowledge Discovery and Data Mining

Knowledge Discovery and Data Mining Knowledge Discovery and Data Mining Unit # 10 Sajjad Haider Fall 2012 1 Supervised Learning Process Data Collection/Preparation Data Cleaning Discretization Supervised/Unuspervised Identification of right

More information
