Intelligent Agents and Fraud Detection

Authors: Jia Wu and Jongwoo Park

1. Introduction

Fraud has plagued the telecommunications industry, financial institutions, and other organizations for a long time. The types of fraud addressed in this paper include cellular communication fraud, credit card transaction fraud, and computer intrusions. These frauds cost businesses millions of dollars per year. As a result, fraud detection has become an important and urgent task for these businesses. A number of methods have been implemented to detect fraud, ranging from statistical approaches (e.g. data mining) to hardware approaches (e.g. firewalls, smart cards).

Data mining is currently a popular way to combat fraud because of its effectiveness. Hand et al. define data mining as a well-defined procedure that takes data as input and produces output in the form of models or patterns. In other words, the task of data mining is to analyze a massive amount of data and extract usable information that can be interpreted for future use. To do so, we have to define a clear goal for the data mining task and find the right structure of models or patterns to fit the given data set. Once we have the right model for the data, we can use it to predict future events by classifying new data.

In data mining terms, fraud detection can be understood as a classification problem. Input data is analyzed with an appropriate model to determine whether it implies fraudulent activity. A classification model is developed by recognizing the patterns of past fraudulent behavior, and the model can then be used to flag suspicious activity in new data.

One limitation of using data mining alone for fraud detection is efficiency. Data mining and model construction require a lot of time, which prevents them from detecting fraud in real time. This is a serious drawback since, on many occasions such as online credit card transactions, we need to detect fraudulent activity in a very short period of time. Otherwise, the loss could be huge.

With the rapid development of information technologies, many new methods that exploit the power of IT to detect fraud have been created. One of these recent methods is the use of intelligent agents for fraud detection, which combines computer technologies with data mining knowledge. In this paper we examine the use of intelligent agents in fraud detection. By intelligent agents we mean computer programs that can act on behalf of a person to do various jobs.

Intelligent agents can automate a large portion of the fraud detection process and require little human intervention. Additionally, intelligent agents do not stick to one model or rule: with their machine learning capabilities they can construct new models and rules for fraud detection. Intelligent agents should therefore be harder to deceive than other fraud detection programs. Besides, in a multi-agent system, many intelligent agents can work in parallel and cooperate with each other. This not only accelerates the detection process but also increases detection accuracy. Moreover, intelligent agents can be deployed online for real-time detection, an extremely desirable feature for online credit card fraud detection and network intrusion detection.

The rest of the paper is organized as follows. Section 2 examines a variety of frauds, namely cellular frauds, credit card transaction frauds, and computer intrusions. Section 3 discusses various data mining algorithms for detecting fraud and one pattern comparison algorithm. Section 4 describes three types of intelligent agents. Section 5 gives some applications of intelligent agents in fraud detection. Section 6 proposes further research on this topic, in which we attempt to apply intelligent agents to fraud detection in continuous auditing. Section 7 concludes.

2. Types of Frauds

2.1 Frauds in Mobile Communications

In the United States, fraud in mobile communications costs the industry hundreds of millions of dollars per year (Walters and Wilkinson 1994, Steward 1997). The nature of mobile communication networks makes it easy for criminals to commit fraud and hard to trace them. One of the most widespread and costly frauds in this area is cloning fraud. A mobile phone is identified by two numbers, the mobile identification number (MIN) and the electronic serial number (ESN). Cloning occurs when a criminal uses a mobile communication scanner to steal the MIN and ESN of a legitimate subscriber and programs them into another phone. Afterwards, the illegitimate user can make unlimited calls that are billed to the legitimate user. On one hand, cloned phones attract many illicit users because the calls are free and untraceable. On the other hand, the fraudulent use of cloned phones costs mobile communication service providers millions of dollars in lost revenue. In addition, because calls made from cloned phones are very difficult to trace, criminals and terrorists can take advantage of them to perpetrate more serious crimes.

2.2 Frauds in Credit Card Transactions

Credit card fraud has long been a headache for credit card companies. With the growth of online business around the world, the number of credit card frauds has also increased drastically. A criminal can either steal the physical card to use offline or simply obtain the card number to use online. Losses from credit card fraud are higher than those from mobile communication fraud since credit card fraud usually involves large transaction amounts. Like mobile communication fraud, credit card fraud is not easy to trace.

2.3 Intrusions in Computer Systems

Intrusion detection plays a vital role in today's networked environment. Intrusions into computer systems include unauthorized users penetrating the systems and authorized users abusing their privileges. Intrusion into computer systems is the most widespread type of fraud since it is easy to commit. Furthermore, it is very difficult to trace
the intruders because they may hide anywhere in the world as long as they have an Internet connection.

3. Fraud Detection Algorithms

Fraud detection is founded on data mining techniques such as classification and association rules. Research on fraud detection has focused on pattern matching, in which abnormal patterns are distinguished from normal ones. We focus on the Detector Constructor framework, called DC-1, proposed by Fawcett and Provost (1997) for detecting fraudulent telephone calls, and on the intrusion detection framework proposed by Lee, Stolfo, and Mok (1998).

3.1 Detector Constructor Systems (DC-1) (Fawcett and Provost, 1997)

Fawcett and Provost's (1997) approach focuses on the sensitivity and profile of individual accounts, and on aggregating them to obtain better predictive power. They apply their approach to the account history of cellular calls. The Detector Constructor framework (hereafter DC-1) starts by analyzing available call records, including defrauded calls.

(1) Classification Rule Learning

First, based on the given history of an account, the calls of the account are labeled as fraudulent or legitimate (non-fraudulent). A local set of rules for the account is then searched for. For example, for one specific account, the following classification rule is derived:

(Time-of-Day = Night) AND (Location = Bronx) → Fraud, with certainty factor = 0.89

The certainty factor is defined as a simple frequency-based probability estimate. This rule means that, for this account, a call made at night from the Bronx is estimated to be fraudulent with probability 0.89.
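As a minimal sketch of the frequency-based estimate described above, the certainty factor of a rule can be computed as the fraction of calls matching the rule's condition that are labeled fraudulent. The `Call` record, the helper names, and the sample call history below are hypothetical illustrations, not part of DC-1 itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    time_of_day: str   # e.g. "Night" or "Day"
    location: str      # e.g. "Bronx"
    fraudulent: bool   # label taken from the account history


def certainty_factor(calls, condition):
    """Fraction of the calls matching `condition` that are labeled fraudulent."""
    matching = [c for c in calls if condition(c)]
    if not matching:
        return 0.0
    return sum(c.fraudulent for c in matching) / len(matching)


def night_in_bronx(call):
    """Condition of the example rule: (Time-of-Day = Night) AND (Location = Bronx)."""
    return call.time_of_day == "Night" and call.location == "Bronx"


# Hypothetical history: 9 calls match the condition, 8 of them are fraudulent.
history = (
    [Call("Night", "Bronx", True)] * 8
    + [Call("Night", "Bronx", False)]
    + [Call("Day", "Bronx", False)] * 5
)

cf = certainty_factor(history, night_in_bronx)
print(round(cf, 2))  # 0.89
```

Calls that do not match the condition (the daytime calls here) do not affect the estimate, which is why the rule can be very specific to one account.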

However, because the rules generated in this way are specific to a single account, a set of general rules that can serve as fraud indicators across accounts is required. To generate rules that apply to as many accounts as possible, they devise an algorithm controlled by two parameters, T_rules and T_accts. T_rules is a threshold on the number of rules required to cover each account, and T_accts is the number of accounts in which a rule must have been found in order to be selected at all. A rule is selected once an account has been examined with a sufficient number of rules and the rule has been found in a sufficient number of accounts. The list of rules generated from each account is reviewed, and the rule that appears most frequently across the entire set of accounts is chosen. [Refer to Appendix I for their algorithm for rule selection for DC-1].

(2) Construction of Profiling Monitors

After rules are selected, a set of monitors is built. The purpose of profiling monitors is to investigate the sensitivity of individual accounts to the general rules. A profiling monitor works in two stages, a profiling stage and a usage stage. In the profiling stage, a general rule is applied to a portion of an account's legitimate usage to characterize the account's normal activity. In other words, the legitimate activity of an account is summarized by the profiling monitors through the use of templates, and the resulting statistics are saved with that account. Later, in the usage stage, the monitor is applied to a whole day of the account's activity (an account-day). The resulting statistics can be used to measure how abnormal the account's usage is on that day. The profiling monitors are built by the monitor constructor, which holds a set of templates. Each template is instantiated with the conditions of a rule, and each resulting rule-template pair becomes a profiling monitor.
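The two stages can be illustrated with a small sketch. It assumes one possible template: the profiling stage stores the mean and standard deviation of the daily count of calls that satisfy a rule's condition, and the usage stage reports how many standard deviations an account-day lies above that mean. The function names and the daily counts are hypothetical, and the fallback for a zero standard deviation is an arbitrary choice made for this illustration.

```python
from statistics import mean, pstdev


def profile_account(legit_daily_counts):
    """Profiling stage: summarize legitimate usage as (mean, standard deviation)."""
    return mean(legit_daily_counts), pstdev(legit_daily_counts)


def monitor_output(profile, day_count):
    """Usage stage: standard deviations by which an account-day exceeds the profile mean."""
    mu, sigma = profile
    if day_count <= mu:
        return 0.0          # usage at or below normal is not suspicious
    if sigma == 0:
        sigma = 1.0         # assumption: avoid dividing by zero for constant profiles
    return (day_count - mu) / sigma


# Hypothetical counts of night-time Bronx calls on four legitimate days.
profile = profile_account([1, 1, 3, 3])   # mean 2, standard deviation 1
print(monitor_output(profile, 5))         # 3.0
print(monitor_output(profile, 2))         # 0.0
```

Because the profile is stored per account, the same general rule yields different outputs for an account that routinely calls at night and for one that never does.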
For example, templates can be built from various statistical expressions, such as a threshold monitor and a standard deviation monitor. In the threshold monitor, a binary output is produced according to whether the user's behavior on a given day exceeds a threshold defined during profiling. In the standard deviation monitor, different output values
are produced according to how far the user's behavior on a given day deviates from the normal level, for the rule's condition, that was established during profiling.

(3) Combination of Evidence from the Monitors

To improve the confidence of detection, the outputs of the monitors are combined using evidence obtained by applying the monitors to sample data. For example, the generated monitors are applied to a sample account-day, and their outputs, indicating whether fraudulent activity is detected, are expressed as a result vector for that day. The evidence about the account-day, that is, whether the account-day truly contains fraud, is supplied together with the outputs. The outputs are then weighted accordingly, and the combination is trained together with a threshold value on the weighted sum of the outputs. Hence, it is possible to place more confidence in monitors with larger weights in order to prevent false alarms. Even so, some monitors may be redundant or ineffective. To reduce the number of monitors, they propose a sequential forward selection process. Finally, the fraud detectors are selected from the monitors combined with evidence.

3.2 Intrusion Detection Framework (Lee, Stolfo, and Mok, 1998)

Lee, Stolfo, and Mok (1998) design an intrusion detection framework that uses data mining techniques. Intrusion detection techniques fall largely into two categories, anomaly detection and misuse detection. Anomaly detection focuses on extracting normal (non-fraudulent) usage patterns and finding deviations from them. Misuse detection, on the other hand, captures the patterns of previous intrusions and the vulnerable spots of a system from historical audit data, and then compares an intrusion attempt against these previously identified patterns.

Their intrusion detection framework starts from the observation that intrusion attempts may leave a series of access failures recorded in the network traffic audit data. Therefore, it is possible to detect intrusions (fraudulent behaviors) by using classification and association rules augmented with episode analysis.
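Since association rules carry much of the weight in this framework, a short sketch of the two standard measures attached to a rule X → Y may be useful: the support of the rule is the fraction of records containing both X and Y, and its confidence is that support divided by the support of X alone. The record encoding (sets of "attribute=value" strings), the helper names, and the sample audit records are assumptions made purely for illustration.

```python
def support(records, itemset):
    """Fraction of records that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= record for record in records) / len(records)


def confidence(records, lhs, rhs):
    """support(lhs and rhs together) / support(lhs); 0.0 if lhs never occurs."""
    lhs_support = support(records, lhs)
    if lhs_support == 0:
        return 0.0
    return support(records, set(lhs) | set(rhs)) / lhs_support


# Hypothetical connection records: 10 in total, 4 http, 3 of those with flag SF.
records = (
    [frozenset({"service=http", "flag=SF"})] * 3
    + [frozenset({"service=http", "flag=REJ"})]
    + [frozenset({"service=smtp", "flag=SF"})] * 6
)

s = support(records, {"service=http", "flag=SF"})
c = confidence(records, {"service=http"}, {"flag=SF"})
print(round(s, 2), round(c, 2))  # 0.3 0.75
```

In this toy data the rule (service = http) → (flag = SF) would be written with confidence 0.75 and support 0.3.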

(1) Association Rules and Frequent Episode Rules

Their framework starts with the expression of an association rule, X → Y [c, s]. X and Y are item sets (subsets of the attributes of the data set, where attributes are the columns of the data set), s is the support of the rule, support(X ∪ Y), and c is its confidence, support(X ∪ Y) / support(X). Association rules are mined with the Apriori algorithm (Agrawal and Srikant, 1994), in which frequent item sets, starting from length 1, are repeatedly joined into longer candidates while candidates containing infrequent subsets are pruned. If the support of an item set is greater than a given threshold, the item set is considered frequent. For example, trn → rec.humor [0.3, 0.1] means that when a user invokes trn, 30% of the time the user reads rec.humor, and that reading rec.humor with trn accounts for 10% of the user's activities.

To take the sequential nature of events into account, they use the concept of frequent episodes based on minimal occurrences, devised by Mannila and Toivonen (1996). A frequent episode rule is represented as X, Y → Z [c, s, window]. This expression denotes an episode in which X precedes Y and Y precedes Z, occurring with the given confidence and support. Each occurrence of the episode must have a width (time interval) less than the value of window.

(2) Introduction of the Axis Attributes and Reference Attributes

To prevent meaningless patterns from being generated, they devise the concepts of axis attributes and reference attributes. Axis attributes are the attributes essential to constructing association patterns, so an item set must contain them to yield a meaningful association pattern. For example, if the service provided by a connection is important, the service attribute becomes an axis attribute. The pattern can then be expressed as follows [Refer to Table 1 of Appendix II]:

(service = smtp, src_bytes = 200, dst_bytes = 300, flag = SF),
(service = telnet, flag = SF) → (service = http, src_bytes = 200), [0.2, 0.1, 2s]

In addition, they devise the concept of reference attributes. They find that some patterns in intrusion attempts have an attribute that plays the role of a subject, to which the action attributes refer. For example, the web log records may show that the sequence /images, /images, and /shuttle/missions/sts-71 is requested by the same remote host, his.moc.kw. [Refer to Table 2 of Appendix II] The next time the same sequence of requests is recognized, it is possible to check whether the new sequence refers to the same subject attribute. If it does not, the episode can be dropped from the candidate patterns. They state that pattern finding is improved by devising axis and reference attributes and defining a frequent episode algorithm around them.

(3) Level-wise Approximate Mining

In some cases, a pattern with a low frequency matters. However, if the support threshold is lowered to capture a less frequent but important pattern, the number of rules may grow excessively. To prevent this, they propose level-wise approximate mining. First, the episodes with high-frequency axis attribute values are searched for. Second, the episodes with low-frequency axis attribute values are searched for by reducing the support threshold, while the old high-frequency axis attribute values are retained. Since the episodes involving only the old high-frequency axis attribute values have already been found, a new pattern is considered only if it contains at least one of the new, infrequent attribute values. For example, assuming the following episode rule is generated in the first level,

(service = smtp, src_bytes = 200), (service = smtp, src_bytes = 200) → (service = smtp, dst_bytes = 300), [0.3, 0.3, 2s]
in the second level-wise rule, one axis attribute value is changed from smtp (a frequent value) to http (an infrequent value) and the support threshold is decreased to 0.1:

(service = smtp, src_bytes = 200), (service = http, src_bytes = 200) → (service = smtp, dst_bytes = 300), [0.4, 0.1, 2s]

[Refer to Appendix III for the algorithm for level-wise approximate mining of frequent episodes]

3.3 Algorithms for Pattern Comparisons (Lee, Stolfo, and Mok, 1999)

Their data come from simulated intrusion attempts made with attack programs. They observe that the patterns of a normal traffic data set differ from the patterns of simulated intrusion attacks. They suggest that, by iteratively comparing these two sets of patterns, it is possible to identify the patterns of intrusion attacks clearly. During this process, axis attributes and reference attributes are crucial for rapid pattern comparison. Through pattern comparison, fraud patterns can be generated.

(1) Encoding Scheme

First, after level-wise approximate mining with axis and reference attributes, candidate patterns are merged and selected. Once the frequent patterns of normal traffic and of intrusion attacks have been obtained, each pattern can be encoded as a series of numbers that can be compared. These numbers are then compared, and the most or least similar numbers (patterns) can be selected according to the purpose.

Suppose the data table has n attributes. Each row can then represent an association. Some rows have a full set of attributes, while others are missing some attributes. First, the attributes of an association are ordered by (user-defined) importance, such as flag, axis attributes, reference attributes, and
so on. Second, the positions of missing attributes are filled with the null value, 0. Therefore, an association can be expressed in the complete and ordered form (A_1 = v_1, A_2 = v_2, ..., A_n = v_n). For example, associations are encoded as shown in Table 3. [Refer to Table 3 of Appendix II] Different numbers are assigned to the different values of each attribute, and 0 is assigned to a missing attribute. Since the flag attribute tells whether the association comes from normal traffic (SF) or from an intrusion attack (SO), it comes first in order of importance in the encoding.

(2) Comparing Two Patterns

After the associations are encoded, an episode is mapped by combining its associations. For example, after encoding, association X becomes x = x_1 x_2 ... x_n, association Y becomes y = y_1 y_2 ... y_n, and association Z becomes z = z_1 z_2 ... z_n. An episode X, Y → Z can then be expressed as x_1 z_1 y_1 x_2 z_2 y_2 ... x_n z_n y_n, a one-dimensional series of numbers. For example, if an intrusion attack episode is given as

(flag = SO, service = http), (flag = SO, service = http) → (flag = SO, service = http) [0.93, 0.03, 2],

the encoded episode becomes [...]. When a normal traffic episode is given as

(flag = SF, service = http), (flag = SF, service = icmp_echo) → (flag = SF, service = http),

the encoded episode becomes [...]. Consequently, two episodes can be compared by subtracting one from the other and taking the absolute value of the difference at each digit (the diff score). For example, for the two episodes above, the diff score, d_x1 d_z1 d_y1 d_x2 d_z2 d_y2 ... d_xn d_zn d_yn, is [...]. Based on this method, they give the following procedure for selecting patterns: (1) Encode all patterns. (2) For each pattern from the
intrusion dataset, calculate its diff score with each normal pattern, and keep the lowest diff score as the intrusion score for this pattern. (3) Output all patterns that have non-zero intrusion scores, or a user-specified top percentage of patterns with the highest intrusion scores.

4. Intelligent Agents in Fraud Detection

Fraud detection is a non-trivial task in this age of information explosion. It faces three major challenges.

First, fraud detection usually involves a large amount of data. In the United States, there were 25 million cellular phone users in 1997, who made about 30 million calls per day (Abu-Hakima et al., 1997), and these numbers are estimated to have doubled in the last three years. In Spain, more than 1.2 million Visa card operations take place on a given day, 98% of them handled online (Dorronsoro et al., 2001). Detecting fraud in such a high volume of data is worse than finding a needle in a haystack: it is easy to tell a needle from hay, but hard to tell fraudulent activities from legitimate ones since they look similar.

Second, fraud detection needs to be highly accurate. Although the total amount of fraud is very high, the fraud rate is low relative to the gigantic volume of legitimate operations. For credit card transactions in general, the fraud rate is 0.93%, and it is 1.97% for online credit card transactions. Thus, a good fraud detection mechanism should both catch fraud and keep false alarms down. In statistical terms, it should be low in both Type I error and Type II error. On one hand, a low fraud coverage rate increases losses for service providers. On the other hand, a high false alarm rate irritates customers and drives them away from the company's business. As a result, no prediction success rate below 99.9% is acceptable (Brause and Hepp, 1999).

Third, fraud needs to be detected fast. The expansion of telecommunication networks, the growth of e-business, and the wide deployment of computer systems have brought convenience to legitimate users and criminals alike. A criminal can commit many
frauds with high dollar amounts in a short period of time. Moreover, legitimate users and customers will lose patience if they have to wait too long for a fraud check during an operation or transaction. Thus, we need to detect fraud in a very short period of time. Otherwise, the damage will be costly and the business will lose customers.

It is hard for traditional fraud detection methods to satisfy these requirements. For example, the traditional data mining method for fraud detection requires all data to reside in the computer's main memory, which is impossible when a huge volume of data is involved. Besides, traditional fraud detection methods suffer from either a low fraud coverage rate or a high false alarm rate. Fraud detection methods can also be circumvented. If the mobile phone service provider requires a Personal Identification Number before a call is made, the criminal can clone that number. If a computer administrator deploys a firewall to block illicit use, an intruder can find a weakness in the firewall's configuration and bypass it. Many fraud models constructed with traditional statistical methods generate numerous false alarms, and if criminals change their patterns, the fraud models can be rendered useless. Moreover, traditional fraud detection methods are usually slow because they require a lot of human intervention. Traditional data mining requires a person to sample a data set, analyze it, establish fraud models, and eventually apply the models for fraud detection. If a new fraud pattern emerges, the whole process has to be repeated. This process needs to be executed offline and usually takes a long time.

Intelligent agents can overcome these obstacles. First, since a data set is handled by a number of agents, each agent only needs to deal with one small piece of the data set. If these agents are deployed on different computers, each piece can reside in the main memory of its computer. Second, intelligent agents do not stick to one rule or model to detect fraud. They are able to derive new rules or models when they receive new inputs. In addition, multiple rules or models can be taught to intelligent agents to improve fraud detection. In short, intelligent agents are adaptable enough to counter sophisticated criminals. Last but not least, intelligent agents can
rapidly detect fraud through cooperation, and they can be placed online to detect fraud in real time.

In this paper, we discuss three major types of intelligent agents for fraud detection. The first type is a classification learning multi-agent system, the second type is Java Agents for Meta-learning (JAM), and the third type is artificial neural network agents.

4.1 Classification Learning Multi-agent System

Classification learning agents, or rule-based agents, have been extensively studied by many researchers. This paper examines a classification learning multi-agent system specifically designed to detect mobile communication fraud. This system was proposed by Abu-Hakima et al. It consists of three types of agents, namely the Personal Communication Agents (PCAs), the Mobility Network Agents (MNAs), and the Fraud Breaking Agents (FBAs).

Personal Communication Agents

An important function of PCAs is to set up a user profile. A PCA can monitor and log all outgoing telephone numbers, calling times and durations, and receivers' information. After PCAs have gathered this information, they put it in a user profile database, where it is compared and analyzed against the user's previous calling history. The user's calling patterns can be generated using the DC-1 algorithm described earlier. For example, one of the user's calling patterns may be that the user makes business calls from 9:00 AM to 5:00 PM and personal calls from 5:00 PM to 11:30 PM. Therefore, if the user makes a business call after 11:30 PM, the call is quite likely to be fraudulent. The PCA compares the latest user communication with the historical information stored in the user profile database. If it finds that the call is atypical, it tries to inform the user by another means of communication, such as a pager or a regular phone. If the user cannot be reached, the PCA switches off the ongoing call.
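The PCA behavior described above can be sketched as a simple decision procedure. This is only an illustration under stated assumptions: the `UsageWindow` record, the function names, and the returned action strings are hypothetical, and a real profile would be derived from the calling history rather than written by hand.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class UsageWindow:
    kind: str     # "business" or "personal"
    start: time   # start of the usual calling period
    end: time     # end of the usual calling period


def is_typical(profile, kind, at):
    """True if a call of this kind at this time falls inside a profiled window."""
    return any(w.kind == kind and w.start <= at <= w.end for w in profile)


def handle_call(profile, kind, at, user_reachable):
    """Allow typical calls; otherwise notify the user, or disconnect if unreachable."""
    if is_typical(profile, kind, at):
        return "allow"
    if user_reachable:
        return "notify_user"
    return "disconnect"


# Profile from the example in the text: business 9:00-17:00, personal 17:00-23:30.
profile = [
    UsageWindow("business", time(9, 0), time(17, 0)),
    UsageWindow("personal", time(17, 0), time(23, 30)),
]

print(handle_call(profile, "business", time(10, 15), user_reachable=True))   # allow
print(handle_call(profile, "business", time(23, 45), user_reachable=True))   # notify_user
print(handle_call(profile, "business", time(23, 45), user_reachable=False))  # disconnect
```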

Mobility Network Agents

The MNAs, which are expected to reside in the mobile switching center, interact with the mobile service subscriber's PCAs to create a better user profile. MNAs can provide billing information about a user to PCAs on a continuous basis, and from this information the PCAs can update their user profile databases. If an MNA detects a suspicious call, it alerts the PCA. The PCA then either alerts the user about the suspicious call or monitors the call information for additional evidence that the call is fraudulent.

Fraud Breaking Agents

Equipped with one or more classification algorithms such as DC-1, Bayes, Ripper, and CART, a Fraud Breaking Agent specializes in detecting fraudulent calling patterns. Those patterns include long international calls, simultaneous calls originating from one cell phone, and calls to known criminal centers or suspicious regions. FBAs also reside in the mobile switching center. Based on FBA information, the MNA alerts the PCA to check the user profile for any numbers and characteristics matching the suspicious calls.

Each of the PCAs, MNAs, and FBAs thus provides an additional level of protection against fraudulent calls. They interact with each other using various algorithms and check different databases to keep the number of false alarms low. Such a system could be a very effective fraud detection system if deployed in the real world.

4.2 Java Agents for Meta Learning

The JAM system is a distributed, scalable, and portable agent-based data mining system developed by a research group based at Columbia University. A JAM system consists of two levels of agents: the base level agents and a higher level agent. In order to detect fraud, the JAM system needs to compute a fraud detector that judges whether a transaction or an operation is fraudulent. In JAM this fraud detector is called a classifier. JAM uses a machine learning process called meta-learning to compute the classifier (Chan and Stolfo
1993). In the meta-learning process, a training data set is divided into several small subsets, and each subset is distributed to a base agent. Each base agent then computes a base classifier, which is a model of its data subset, using one of the Bayes, C4.5, CART, ID3, or Ripper algorithms or the intrusion detection framework described earlier. Next, all the base classifiers are delivered to the higher level agent. Each base classifier is tested for prediction accuracy against a separate subset of the training data, called a validation set. Through these tests, the higher level agent learns the characteristics and performance of the base classifiers. It then integrates these independently computed base classifiers into one higher level classifier, called a meta classifier, again using one of the Bayes, C4.5, CART, ID3, or Ripper algorithms or the intrusion detection framework. The meta classifier is the model of the global data set, and the JAM system can use it to detect fraud.

The JAM system has several advantages over the traditional data mining method for fraud detection. The computation of base classifiers is a distributed process, so the base agents can be placed in different locations to deal with different data sets. This has two implications for fraud detection in credit card transactions. First, each bank usually has its own confidential data set and established fraud detection mechanism. It would be better if all the data sets and existing fraud detection algorithms were shared among the banks, but banks are normally unwilling to exchange confidential data and information with others. Therefore, if we place the base agents of JAM at each bank's data set rather than building a centralized data set, the system can fully leverage the collective fraud detection knowledge of the different banks without breaching their confidentiality requirements. The second advantage is that each base agent only needs to handle one data subset rather than a huge centralized data set, which greatly reduces the agent's workload.

Furthermore, machine learning in a JAM system is a two-level process. Compared with most other data mining methods, this can lead to a better fraud detector: the fraud model obtained is intended to be globally rather than locally optimal, because it is improved through the meta-learning process.
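The two-level process can be sketched in a few lines. This is not JAM's implementation: as a stand-in for Bayes, C4.5, and the other learners named above, each base agent here trains a one-feature threshold rule (a decision stump), and the higher level agent learns a lookup table from the vector of base predictions to the majority label on a validation set, which is a very simple form of stacking. All function names and the transaction data (amount, hour of day, fraud label) are hypothetical.

```python
from collections import Counter, defaultdict


def train_stump(rows):
    """Base learner: pick the (feature, threshold) with the fewest training errors."""
    best = None
    for f in range(len(rows[0][0])):
        for t in sorted({x[f] for x, _ in rows}):
            errors = sum((x[f] > t) != bool(y) for x, y in rows)
            if best is None or errors < best[0]:
                best = (errors, f, t)
    _, f, t = best
    return lambda x: int(x[f] > t)   # predicts 1 (fraud) above the threshold


def train_meta(base_classifiers, validation_rows):
    """Meta learner: map each vector of base predictions to its majority label."""
    votes = defaultdict(Counter)
    for x, y in validation_rows:
        key = tuple(clf(x) for clf in base_classifiers)
        votes[key][y] += 1
    table = {key: counts.most_common(1)[0][0] for key, counts in votes.items()}

    def meta(x):
        key = tuple(clf(x) for clf in base_classifiers)
        if key in table:
            return table[key]
        return int(2 * sum(key) > len(key))   # unseen combination: majority vote
    return meta


# Hypothetical per-bank training subsets: ((amount, hour), is_fraud).
bank_a = [((20, 10), 0), ((35, 14), 0), ((900, 3), 1), ((1200, 2), 1)]
bank_b = [((15, 9), 0), ((60, 20), 0), ((800, 1), 1), ((950, 4), 1)]
validation = [((25, 11), 0), ((1000, 3), 1), ((40, 15), 0), ((700, 2), 1)]

base = [train_stump(bank_a), train_stump(bank_b)]   # trained where the data lives
meta = train_meta(base, validation)                 # only classifiers are shared

print(meta((1100, 3)), meta((50, 12)))  # 1 0
```

Only the trained base classifiers leave each bank in this sketch; the raw transactions stay local, which mirrors the confidentiality argument made above. The second prediction shows the meta level resolving a disagreement between the two base classifiers.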

16 Owing to these desirable features, the JAM system shows good results in detecting credit card frauds and computer intrusions when tested in lab. 4.3 Artificial Neural Network Lippman (1987) defines the Artificial Neural Network as a statistical information processing mechanisms composed of numerous distributed processing unit or nodes that perform simultaneous computations and communicate using adaptable interconnections called weights. Artificial Neural Network (hereafter ANN) consists of nodes residing in three layers including input, output, and hidden layers. Although the input and output nodes are determined by the user according to purposes, the hidden nodes serving as connectors between input and output layers are established by the network itself through training. Desouza (2001) defines that, in general, the processes of ANN comprise three stages such as training, testing, and deployment. In the training process, different weights are assigned nodes and layers by different training algorithms using the past data. Then, by using testing data extracted from the past data, it is possible to evaluate whether ANN can operate as desired. This combination of training and testing will be performed repeatedly until the model is obtained. There are two types of training methods: supervised and unsupervised methods. In the supervised training method, both the input and the desired result are provided. And the output is compared with the desired data until the predetermined accuracy is obtained by changing different links and weights assigned to ANN. However, in the unsupervised training method, only input data are given and human users do not compare output data with the desired results. Seymour (2000) states that ANN may be the best solutions in the situations where rule selection is difficult in terms of speed and complexity. He argues that ANN is preferred in two main reasons. 
First, the arithmetic characteristics of ANN make it good at handling large volumes of data; ANN focuses on pattern identification rather than data analysis. Second, ANN can keep adjusting the weights on its links as data accumulate during training, so it can adapt easily and quickly to changes in the input. This is the main reason that ANN is considered an ideal technique for fraud detection applications that deal with large amounts of data.

5. Applications of Intelligent Agents in Fraud Detection

Agent-based applications in general, and neural network agent systems in particular, have already been used in fraud detection. Furthermore, they have great potential for wide adoption in the future.

5.1 Applications of JAM

JAM has been applied in a lab test environment to detect fraud in credit card transactions and intrusions in computer systems. For the credit card fraud test, the JAM research team used data sets provided by Chase and First Union. The two data sets, developed over years by experienced bank personnel for fraud detection, share a number of common properties. These include a hashed credit card number, scores produced by a commercial authorization/detection system, the date and time of the transaction, past payment information of the card user, and the amount of the transaction. Each of the two bank data sets also contains some important proprietary properties (PF). This causes a data schema integration problem, which can be solved in two ways. One is to learn a local model using PF information, then exchange the PF information between the two data sets and compute a new local model. The other is to learn a model using PF information and hold it locally without exchange.

The team sampled 84,000 records from the 500,000 records in the two data sets for the learning process. The purpose of learning is to identify fraudulent characteristics in the 30 attribute fields in order to establish a fraud model. They applied several algorithms, including Bayes, C4.5, CART, ID3, and Ripper, to both the base classifier and meta-classifier learning processes. The results indicate that Ripper and CART produced the best base classifiers, and Bayes generated the best and most stable meta-classifier. Ripper and CART could catch 80% of the fraudulent transactions while raising false alarms on 16% of the legitimate transactions. In comparison, Bayes could catch 80% of the fraudulent transactions while raising false alarms on only 13% of the legitimate transactions. At the opposite end, ID3 was found to be the worst algorithm in overall performance.

JAM was also tested on detecting intrusions in computer systems. A command called lpt in the LINUX operating system can be abused by an intruder to cause a buffer overflow. JAM was applied to determine whether the command was sent by a legitimate user or an intruder. In this context, the agents are trained using the intrusion detection framework, which includes axis and reference attributes, level-wise approximate mining, and pattern comparison. The test results indicated good performance.

5.2 Application of Neural Network

PayPal has successfully implemented a neural network system, which brings huge revenues for the company. PayPal is an online person-to-person (P2P) payment company that allows one user to pay another by email. To use the PayPal service, both payer and payee must register beforehand and link their PayPal accounts to either a bank account or a credit card account. To complete a payment, the payer logs into his or her PayPal account and tells PayPal the email address of the payee and the payment amount. PayPal then transfers that amount from the payer's PayPal account to the payee's account and charges a service fee for the transaction. With the PayPal service, a user does not need to register with a credit card company to receive credit card payments from another user. This brings a lot of convenience to individuals and small online businesses that need to make online P2P payment transactions.
However, this type of online P2P payment system suffers from many illegal transactions. Since customers are usually not liable for the losses, it is crucial for merchants to stop these frauds. Apart from PayPal, several other companies were in a similar business. However, because of large fraud losses, these companies went out of business one after another. PayPal has survived primarily because of its fraud detection system, named Igor.

Igor incorporates both old and new techniques from the field of artificial intelligence. It is a rule-based expert system equipped with neural network technology. Igor knows a series of rules (for example, if the recipient is associated with a known terrorist group, then block the payment). In addition, Igor's pattern detection algorithm can monitor user activities and learn new types of fraud over time. If a user keeps opening new PayPal accounts linked to the same set of credit cards, Igor will learn the scam through data mining and watch the user's payment activities more cautiously. The fraud rate at PayPal is around 0.5%, much better than the average fraud rate of 1.13% for online merchants.

6. Further Research Areas

Intelligent agents for fraud detection can be applied in many areas. One of them is continuous auditing, a promising field that can automate the auditing process and provide audit reports on a continuous basis. However, one weakness of continuous auditing is possible management fraud: because of the lack of human intervention, management fraud is more likely to occur. A multi-agent system for fraud detection can address this problem. Agents can be deployed at supply chain partners' sites, at the company's general ledger level, and at the company's financial statement level. The agents at the partners' sites can monitor transaction activities, interact with the general ledger level agents to verify data accuracy, and signal an alarm if there is an unusual transaction. After a transaction is completed, all the transaction data are collected by the general ledger level agents and then delivered to the financial statement level agents.
After the delivery, the financial statement level agents will summarize the information and create a set of financial statements. These agents will then compare the data in the reports with those in historical financial reports to check overall reasonableness. If the data are suspicious, the agents at the financial statement level will alert the human auditor. All these agents will be created and deployed by the CPA firms to ensure the auditor's independence. The agents at the general ledger level and the financial statement level should be XBRL-compliant. With the agents' aid, analytical procedures, substantive tests of balances, and tests of details of balances can be performed automatically. The financial data are double-checked, against both historical data and partners' information, to prevent management fraud.

7. Conclusion

Intelligent agents can play an important role in the fraud detection domain. They are robust enough to defeat sophisticated fraudsters, fast enough to minimize fraud damage, and scalable enough to tackle huge volumes of data. Intelligent agents will eventually be the ultimate means of fighting fraud. However, there is still a long way to go before intelligent agents are widely adopted for fraud detection. The accuracy of fraud detection needs to be improved, the reliability of the agents needs to be ensured, and the costs of building and deploying these agents need to be reduced.

In addition, research on fraud detection in the accounting field, especially from the point of view of continuous auditing, does not currently seem to be active. Several reasons can be suggested. First, unlike intrusion detection in computer networks and fraud detection for calling cards, it is much harder to find particular patterns or episodes in accounting data. According to Lee, Stolfo, and Mok (1999, Mining in a Data-flow Environment), truly automated fraud detection has not yet been researched; in other words, research on anomaly detection has not yet produced solid results. Tools for fraud detection therefore always lag behind newly developed fraud schemes, since the record of a fraud scheme must be learned before the detector can be trained on it. Because it is very hard to tell abnormal activities that are truly fraudulent from those that are merely unusual but legitimate, implementing anomaly detection seems difficult. If all participants in the industry can share their historical fraud data and fraud classifiers, the wide adoption of intelligent agents can be realized in the near future.

Appendix I. Rule selection and covering algorithm used by DC-1

Given:
    Accts: set of all accounts
    Rules: set of all fraud rules generated from Accts
    T_rules: (parameter) number of rules required to cover each account
    T_accts: (parameter) number of accounts in which a rule must have been found
Output:
    S: set of selected rules

/* Initialization */
S = { };
for (a ∈ Accts) do Cover[a] = 0;
for (r ∈ Rules) do
    Occur[r] = 0;         /* Number of accounts in which r occurs */
    AcctsGen[r] = { };    /* Set of accounts generating r */
end for

/* Set up Occur and AcctsGen */
for (a ∈ Accts) do
    R_a = set of rules generated from a;
    for (r ∈ R_a) do
        Occur[r] := Occur[r] + 1;
        add a to AcctsGen[r];
    end for;
end for

/* Cover Accts with Rules */
for (a ∈ Accts) do
    R_a = list of rules generated from a;
    sort R_a by Occur;
    while (Cover[a] < T_rules) do
        r := highest-occurrence rule from R_a;
        remove r from R_a;
        if (r ∉ S and Occur[r] ≥ T_accts) then
            add r to S;
            for (a2 ∈ AcctsGen[r]) do
                Cover[a2] = Cover[a2] + 1;
            end for;
        end if
    end while;
end for

*Source: Fawcett and Provost (1997), Adaptive Fraud Detection
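The covering procedure in Appendix I can be sketched in Python roughly as follows. This is an illustrative reading of the pseudocode, not code from Fawcett and Provost: the function name `select_rules`, the input layout (a dict from account to the set of rules generated from it), and the example rule names are all assumptions. The sketch also adds one guard that the pseudocode does not state, so the loop stops when an account has no candidate rules left.

```python
def select_rules(rules_by_account, t_rules, t_accts):
    """rules_by_account maps an account id to the set of rules generated from it."""
    occur = {}       # rule -> number of accounts in which the rule occurs
    accts_gen = {}   # rule -> set of accounts that generated the rule
    for account, rules in rules_by_account.items():
        for rule in rules:
            occur[rule] = occur.get(rule, 0) + 1
            accts_gen.setdefault(rule, set()).add(account)

    selected = []    # the set S, kept as a list to preserve selection order
    cover = {account: 0 for account in rules_by_account}
    for account, rules in rules_by_account.items():
        # Highest-occurrence rules first; ties are broken by name for determinism.
        candidates = sorted(rules, key=lambda r: (-occur[r], r))
        # Guard added in this sketch: stop when the account runs out of rules.
        while cover[account] < t_rules and candidates:
            rule = candidates.pop(0)
            if rule not in selected and occur[rule] >= t_accts:
                selected.append(rule)
                for covered in accts_gen[rule]:
                    cover[covered] += 1
    return selected

# Hypothetical accounts and rule names, for illustration only.
rules_by_account = {
    "acct1": {"night_calls", "bronx_origin"},
    "acct2": {"night_calls", "long_duration"},
    "acct3": {"long_duration", "rare_rule"},
}
selected = select_rules(rules_by_account, t_rules=1, t_accts=2)
print(selected)
```

With T_rules = 1 and T_accts = 2, the two rules that occur in two accounts each are enough to cover all three accounts, and the rules seen in only one account are never selected.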

Appendix II.

Table 1. Network Connection Records

Time stamp   Duration   Service   Src_bytes   Dst_bytes   Flg
                        telnet                            SF
                        ftp                               SF
                        smtp                              SF
                        telnet                            SF
                        smtp                              SF
                        smtp                              SF
                        http                              REJ
                        smtp                              SF

*Source: Lee, Mok and Stolfo (2000), Adaptive Intrusion Detection: a Data Mining Approach

Table 2. Web Log Records

Timestamp   Remote host (subject)       Action   Request (action)
1           his.moc.kw                  GET      /images
1.1         his.moc.kw                  GET      /images
1.3         his.moc.kw                  GET      /shuttle/missions/sts
            taka10.taka.is.uec.ac.jp    GET      /images
3.2         taka10.taka.is.uec.ac.jp    GET      /images
3.5         taka10.taka.is.uex.ac.jp    GET      /shuttle/missions/sts-71
8           rjenkin.hip.cam.org         GET      /images
8.2         rjenkin.hip.cam.org         GET      /images
9           rjenkin.hip.cam.org         GET      /shuttle/missions/sts-71

*Source: Lee, Mok and Stolfo (2000), Adaptive Intrusion Detection: a Data Mining Approach

Table 3. Encoding scheme (Encodings of Associations)

Association                                                               Encoding
(flag = SF, service = http, src_bytes = 200)
(service = icmp_echo, dst_host = host B)
(flag = S0, service = http, src_host = host A)
(service = user_app, src_host = host A)
(flag = SF, service = icmp_echo, dst_host = host B, src_host = host C)

*Source: Lee, Mok, and Stolfo (1999), Mining in a Data-flow Environment: Experience in Network Intrusion Detection

Appendix III. Level-wise Approximate Mining of Frequent Episodes

Input: the terminating minimum support s_0, the initial minimum support s_i, and the axis attribute(s)
Output: frequent episode rules Rules

Begin
    R_restricted = ∅;
    scan database to form L = {large 1-itemsets that meet s_0};
    s = s_i;
    while (s ≥ s_0) do begin
        find serial episodes from L: each pattern must contain at least one axis attribute value that is not in R_restricted;
        append new axis attribute values to R_restricted;
        append episode rules to the output rule set Rules;
        s = s/2;
    end while
end

*Source: Lee, Mok, and Stolfo (2000), Adaptive Intrusion Detection: a Data Mining Approach
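The control flow of Appendix III can be sketched in Python as below. This is a deliberately simplified illustration and not the authors' implementation: instead of serial episodes it counts single-record patterns of the form (axis value, other attribute, other value), and the function name, record layout, and sample data are all assumptions. What it does preserve is the level-wise loop: the support threshold is halved on each pass, and a pattern is reported only if its axis value has not already been reported at a higher support level.

```python
from collections import Counter

def level_wise_mining(records, axis, s_initial, s_terminal):
    total = len(records)
    # Count every (axis value, other attribute, other value) pattern once.
    counts = Counter()
    for record in records:
        for attr, value in record.items():
            if attr != axis:
                counts[(record[axis], attr, value)] += 1

    restricted = set()   # axis values already reported at a higher support level
    found = []           # (threshold in force, pattern, support of the pattern)
    s = s_initial
    while s >= s_terminal:
        new_axis_values = set()
        for pattern, count in sorted(counts.items()):
            support = count / total
            if support >= s and pattern[0] not in restricted:
                found.append((s, pattern, support))
                new_axis_values.add(pattern[0])
        restricted |= new_axis_values
        s /= 2           # use a smaller support value on the next pass
    return found

# Hypothetical connection records; "service" plays the role of the axis attribute.
records = (
    [{"service": "smtp", "flag": "SF"} for _ in range(5)]
    + [{"service": "smtp", "flag": "REJ"}]
    + [{"service": "http", "flag": "REJ"} for _ in range(2)]
)
rules = level_wise_mining(records, axis="service", s_initial=0.5, s_terminal=0.1)
for threshold, pattern, support in rules:
    print(threshold, pattern, support)
```

In this toy data the frequent smtp pattern is found on the first pass and the less frequent http pattern on the second. The rare (smtp, flag, REJ) pattern is never reported, because smtp is already in the restricted set by the time the threshold is low enough to admit it. This is how lowering the support brings in patterns for infrequent axis values without flooding the output with low-support patterns for the frequent ones.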

References:

Abu-Hakima, S., Toloo, M., White, T., A Multi-Agent Systems Approach for Fraud Detection in Personal Communication Systems, IJCAI-97 Workshop on Intelligent Adaptive Agents, Portland, Oregon, 1997

Agrawal, R., Srikant, R., Fast Algorithms for Mining Association Rules, In: Proceedings of the 20th VLDB Conference, Santiago, Chile, 1994

Brause, R., Langsdorf, T., Hepp, M., Neural Data Mining for Credit Card Fraud Detection, Working paper, J.W. Goethe University, Comp. Sc. Dep. Report, Frankfurt, Germany, 1999

Brause, R., Langsdorf, T., Hepp, M., Credit Card Fraud Detection by Adaptive Neural Data Mining, Internet Bericht, Frankfurt, Germany, 1999

Cannady, J., The Application of Artificial Neural Networks to Misuse Detection: Initial Results

Chan, P., Stolfo, S., Toward Parallel and Distributed Learning by Meta-Learning, In: AAAI Workshop in Knowledge Discovery in Databases, 1993

Desouza, K., Modeling The Human Brain: Artificial Neural Networks, 2001 (submitted to Journal of the Information Technology Professional, a publication of the Computer Society, IEEE)

Dorronsoro, J., Ginel, F., Sanchez, C., Cruz, C.S., Neural Fraud Detection in Credit Card Operations, Paper Draft, Madrid, Spain, 2001

Fawcett, T., Provost, F., Adaptive Fraud Detection, Data Mining and Knowledge Discovery 1,2, Kluwer Academic Publishers, Boston, Massachusetts, 1997

Lee, W. et al., Real Time Data Mining based Intrusion Detection

Lee, W., Stolfo, S., Mok, K., Mining Audit Data to Build Intrusion Detection Models, In: Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining, New York, NY

Lee, W., Stolfo, S., Mok, K., Mining in a Data-flow Environment: Experience in Network Intrusion Detection, In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD-99), 1999

Lee, W., Stolfo, S., Mok, K., Adaptive Intrusion Detection: a Data Mining Approach, Kluwer Academic Publishers

Mannila, H., Toivonen, H., Discovering Generalized Episodes Using Minimal Occurrences, In: Proceedings of the 2nd International Conference on Knowledge Discovery in Databases and Data Mining, Portland, Oregon, 1996

Patrick, B., Choi, J.H., Assessing the Risk of Management Fraud Through Neural Network Technology, Auditing: A Journal of Practice & Theory, Vol. 16, No. 1, 1997

Prodromidis, A., Stolfo, S., Mining Databases with Different Schemas: Integrating Incompatible Classifiers, In: Proceedings of Fourth International Conference of Knowledge Discovery and Data Mining, AAAI Press, Menlo Park, CA, 1998

Prodromidis, A.L., Stolfo, S., Agent-Based Distributed Learning Applied to Fraud Detection, CUCS working paper, New York, NY, 1999

Seymour, B., How Neural Network Technology Can Tackle the Growing Telecom Fraud Problem, Information Security Bulletin, April 2000

Steward, S., Lighting the way in 97, Cellular Business, 23, January 1997

Stolfo, S., Prodromidis, A., Chan, P.K., JAM: Java Agents for Meta-learning over Distributed Databases, In: Proceedings of Second International Workshop Multistrategy Learning, Center for Artificial Intelligence, George Mason University, Fairfax, VA, 1993

Walters, D., Wilkinson, W., Wireless fraud, now and in the future: A view of the problem and some solutions, Mobile Phone News, October

HYBRID INTRUSION DETECTION FOR CLUSTER BASED WIRELESS SENSOR NETWORK

HYBRID INTRUSION DETECTION FOR CLUSTER BASED WIRELESS SENSOR NETWORK HYBRID INTRUSION DETECTION FOR CLUSTER BASED WIRELESS SENSOR NETWORK 1 K.RANJITH SINGH 1 Dept. of Computer Science, Periyar University, TamilNadu, India 2 T.HEMA 2 Dept. of Computer Science, Periyar University,

More information

Credit Card Fraud Detection Using Meta-Learning: Issues 1 and Initial Results

Credit Card Fraud Detection Using Meta-Learning: Issues 1 and Initial Results From: AAAI Technical Report WS-97-07. Compilation copyright 1997, AAAI (www.aaai.org). All rights reserved. Credit Card Fraud Detection Using Meta-Learning: Issues 1 and Initial Results Salvatore 2 J.

More information

CHAPTER 1 INTRODUCTION

CHAPTER 1 INTRODUCTION 21 CHAPTER 1 INTRODUCTION 1.1 PREAMBLE Wireless ad-hoc network is an autonomous system of wireless nodes connected by wireless links. Wireless ad-hoc network provides a communication over the shared wireless

More information

Intrusion Detection via Machine Learning for SCADA System Protection

Intrusion Detection via Machine Learning for SCADA System Protection Intrusion Detection via Machine Learning for SCADA System Protection S.L.P. Yasakethu Department of Computing, University of Surrey, Guildford, GU2 7XH, UK. [email protected] J. Jiang Department

More information

A Review of Anomaly Detection Techniques in Network Intrusion Detection System

A Review of Anomaly Detection Techniques in Network Intrusion Detection System A Review of Anomaly Detection Techniques in Network Intrusion Detection System Dr.D.V.S.S.Subrahmanyam Professor, Dept. of CSE, Sreyas Institute of Engineering & Technology, Hyderabad, India ABSTRACT:In

More information

A Study of Detecting Credit Card Delinquencies with Data Mining using Decision Tree Model

A Study of Detecting Credit Card Delinquencies with Data Mining using Decision Tree Model A Study of Detecting Credit Card Delinquencies with Data Mining using Decision Tree Model ABSTRACT Mrs. Arpana Bharani* Mrs. Mohini Rao** Consumer credit is one of the necessary processes but lending bears

More information

Intrusion Detection System using Log Files and Reinforcement Learning

Intrusion Detection System using Log Files and Reinforcement Learning Intrusion Detection System using Log Files and Reinforcement Learning Bhagyashree Deokar, Ambarish Hazarnis Department of Computer Engineering K. J. Somaiya College of Engineering, Mumbai, India ABSTRACT

More information

Performance Evaluation of Intrusion Detection Systems

Performance Evaluation of Intrusion Detection Systems Performance Evaluation of Intrusion Detection Systems Waleed Farag & Sanwar Ali Department of Computer Science at Indiana University of Pennsylvania ABIT 2006 Outline Introduction: Intrusion Detection

More information

APPLICATION OF MULTI-AGENT SYSTEMS FOR NETWORK AND INFORMATION PROTECTION

APPLICATION OF MULTI-AGENT SYSTEMS FOR NETWORK AND INFORMATION PROTECTION 18-19 September 2014, BULGARIA 137 Proceedings of the International Conference on Information Technologies (InfoTech-2014) 18-19 September 2014, Bulgaria APPLICATION OF MULTI-AGENT SYSTEMS FOR NETWORK

More information

Outline Intrusion Detection CS 239 Security for Networks and System Software June 3, 2002

Outline Intrusion Detection CS 239 Security for Networks and System Software June 3, 2002 Outline Intrusion Detection CS 239 Security for Networks and System Software June 3, 2002 Introduction Characteristics of intrusion detection systems Some sample intrusion detection systems Page 1 Page

More information

Credit Card Fraud Detection Using Self Organised Map

Credit Card Fraud Detection Using Self Organised Map International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 4, Number 13 (2014), pp. 1343-1348 International Research Publications House http://www. irphouse.com Credit Card Fraud

More information

Credit Card Fraud Detection Using Meta-Learning: Issues and Initial Results 1

Credit Card Fraud Detection Using Meta-Learning: Issues and Initial Results 1 Credit Card Fraud Detection Using Meta-Learning: Issues and Initial Results 1 Salvatore J. Stolfo, David W. Fan, Wenke Lee and Andreas L. Prodromidis Department of Computer Science Columbia University

More information

Electronic Payment Fraud Detection Techniques

Electronic Payment Fraud Detection Techniques World of Computer Science and Information Technology Journal (WCSIT) ISSN: 2221-0741 Vol. 2, No. 4, 137-141, 2012 Electronic Payment Fraud Detection Techniques Adnan M. Al-Khatib CIS Dept. Faculty of Information

More information

The Credit Card Fraud Detection Analysis With Neural Network Methods

The Credit Card Fraud Detection Analysis With Neural Network Methods The Credit Card Fraud Detection Analysis With Neural Network Methods 1 M.Jeevana Sujitha, 2 K. Rajini Kumari, 3 N.Anuragamayi 1,2,3 Dept. of CSE, A.S.R College of Engineering & Tech., Tetali, Tanuku, AP,

More information

Using Artificial Intelligence in Intrusion Detection Systems

Using Artificial Intelligence in Intrusion Detection Systems Using Artificial Intelligence in Intrusion Detection Systems Matti Manninen Helsinki University of Technology [email protected] Abstract Artificial Intelligence could make the use of Intrusion Detection

More information

Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management

Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management Paper Jean-Louis Amat Abstract One of the main issues of operators

More information

DATA MINING APPLICATION IN CREDIT CARD FRAUD DETECTION SYSTEM

DATA MINING APPLICATION IN CREDIT CARD FRAUD DETECTION SYSTEM Journal of Engineering Science and Technology Vol. 6, No. 3 (2011) 311-322 School of Engineering, Taylor s University DATA MINING APPLICATION IN CREDIT CARD FRAUD DETECTION SYSTEM FRANCISCA NONYELUM OGWUELEKA

More information

SHARE THIS WHITEPAPER. Top Selection Criteria for an Anti-DDoS Solution Whitepaper

SHARE THIS WHITEPAPER. Top Selection Criteria for an Anti-DDoS Solution Whitepaper SHARE THIS WHITEPAPER Top Selection Criteria for an Anti-DDoS Solution Whitepaper Table of Contents Top Selection Criteria for an Anti-DDoS Solution...3 DDoS Attack Coverage...3 Mitigation Technology...4

More information

To improve the problems mentioned above, Chen et al. [2-5] proposed and employed a novel type of approach, i.e., PA, to prevent fraud.

To improve the problems mentioned above, Chen et al. [2-5] proposed and employed a novel type of approach, i.e., PA, to prevent fraud. Proceedings of the 5th WSEAS Int. Conference on Information Security and Privacy, Venice, Italy, November 20-22, 2006 46 Back Propagation Networks for Credit Card Fraud Prediction Using Stratified Personalized

More information

Applying machine learning techniques to achieve resilient, accurate, high-speed malware detection

Applying machine learning techniques to achieve resilient, accurate, high-speed malware detection White Paper: Applying machine learning techniques to achieve resilient, accurate, high-speed malware detection Prepared by: Northrop Grumman Corporation Information Systems Sector Cyber Solutions Division

More information

Radware s Behavioral Server Cracking Protection

Radware s Behavioral Server Cracking Protection Radware s Behavioral Server Cracking Protection A DefensePro Whitepaper By Renaud Bidou Senior Security Specialist,Radware October 2007 www.radware.com Page - 2 - Table of Contents Abstract...3 Information

More information

Combining Data Mining and Machine Learning for Effective Fraud Detection*

Combining Data Mining and Machine Learning for Effective Fraud Detection* From: AAAI Technical Report WS-97-07. Compilation copyright 1997, AAAI (www.aaai.org). All rights reserved. Combining Data Mining and Machine Learning for Effective Fraud Detection* Tom Fawcett NYNEX Science

More information

E-Banking Integrated Data Utilization Platform WINBANK Case Study

E-Banking Integrated Data Utilization Platform WINBANK Case Study E-Banking Integrated Data Utilization Platform WINBANK Case Study Vasilis Aggelis Senior Business Analyst, PIRAEUSBANK SA, [email protected] Abstract we all are living in information society. Companies

More information

Data Mining Approach in Security Information and Event Management

Data Mining Approach in Security Information and Event Management Data Mining Approach in Security Information and Event Management Anita Rajendra Zope, Amarsinh Vidhate, and Naresh Harale Abstract This paper gives an overview of data mining field & security information

More information

Observation and Findings

Observation and Findings Chapter 6 Observation and Findings 6.1. Introduction This chapter discuss in detail about observation and findings based on survey performed. This research work is carried out in order to find out network

More information

THE ROLE OF IDS & ADS IN NETWORK SECURITY

THE ROLE OF IDS & ADS IN NETWORK SECURITY THE ROLE OF IDS & ADS IN NETWORK SECURITY The Role of IDS & ADS in Network Security When it comes to security, most networks today are like an egg: hard on the outside, gooey in the middle. Once a hacker

More information

The Data Mining Process

The Data Mining Process Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data

More information

NTT DATA Big Data Reference Architecture Ver. 1.0

NTT DATA Big Data Reference Architecture Ver. 1.0 NTT DATA Big Data Reference Architecture Ver. 1.0 Big Data Reference Architecture is a joint work of NTT DATA and EVERIS SPAIN, S.L.U. Table of Contents Chap.1 Advance of Big Data Utilization... 2 Chap.2

More information

Application of Data Mining Techniques in Intrusion Detection

Application of Data Mining Techniques in Intrusion Detection Application of Data Mining Techniques in Intrusion Detection LI Min An Yang Institute of Technology [email protected] Abstract: The article introduced the importance of intrusion detection, as well as

More information

Introduction... Error! Bookmark not defined. Intrusion detection & prevention principles... Error! Bookmark not defined.

Introduction... Error! Bookmark not defined. Intrusion detection & prevention principles... Error! Bookmark not defined. Contents Introduction... Error! Bookmark not defined. Intrusion detection & prevention principles... Error! Bookmark not defined. Technical OverView... Error! Bookmark not defined. Network Intrusion Detection

More information

Intrusion Detection Systems. Overview. Evolution of IDSs. Oussama El-Rawas. History and Concepts of IDSs

Intrusion Detection Systems. Overview. Evolution of IDSs. Oussama El-Rawas. History and Concepts of IDSs Intrusion Detection Systems Oussama El-Rawas History and Concepts of IDSs Overview A brief description about the history of Intrusion Detection Systems An introduction to Intrusion Detection Systems including:

More information

Advancement in Virtualization Based Intrusion Detection System in Cloud Environment

Advancement in Virtualization Based Intrusion Detection System in Cloud Environment Advancement in Virtualization Based Intrusion Detection System in Cloud Environment Jaimin K. Khatri IT Systems and Network Security GTU PG School, Ahmedabad, Gujarat, India Mr. Girish Khilari Senior Consultant,

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:

More information

Data Mining For Intrusion Detection Systems. Monique Wooten. Professor Robila

Data Mining For Intrusion Detection Systems. Monique Wooten. Professor Robila Data Mining For Intrusion Detection Systems Monique Wooten Professor Robila December 15, 2008 Wooten 2 ABSTRACT The paper discusses the use of data mining techniques applied to intrusion detection systems.

More information

An Anomaly-Based Method for DDoS Attacks Detection using RBF Neural Networks

An Anomaly-Based Method for DDoS Attacks Detection using RBF Neural Networks 2011 International Conference on Network and Electronics Engineering IPCSIT vol.11 (2011) (2011) IACSIT Press, Singapore An Anomaly-Based Method for DDoS Attacks Detection using RBF Neural Networks Reyhaneh

More information

On A Network Forensics Model For Information Security

On A Network Forensics Model For Information Security On A Network Forensics Model For Information Security Ren Wei School of Information, Zhongnan University of Economics and Law, Wuhan, 430064 [email protected] Abstract: The employment of a patchwork

More information

Data Privacy: The High Cost of Unprotected Sensitive Data 6 Step Data Privacy Protection Plan

Data Privacy: The High Cost of Unprotected Sensitive Data 6 Step Data Privacy Protection Plan WHITE PAPER Data Privacy: The High Cost of Unprotected Sensitive Data 6 Step Data Privacy Protection Plan Introduction to Data Privacy Today, organizations face a heightened threat landscape with data

More information

Web Application Security

Web Application Security Web Application Security Richard A. Kemmerer Reliable Software Group Computer Science Department University of California Santa Barbara, CA 93106, USA http://www.cs.ucsb.edu/~rsg www.cs.ucsb.edu/~rsg/

More information

Intrusion Detection for Grid and Cloud Computing

Intrusion Detection for Grid and Cloud Computing Intrusion Detection for Grid and Cloud Computing Author Kleber Vieira, Alexandre Schulter, Carlos Becker Westphall, and Carla Merkle Westphall Federal University of Santa Catarina, Brazil Content Type

More information

International Journal of World Research, Vol: I Issue XIII, December 2008, Print ISSN: 2347-937X DATA MINING TECHNIQUES AND STOCK MARKET

International Journal of World Research, Vol: I Issue XIII, December 2008, Print ISSN: 2347-937X DATA MINING TECHNIQUES AND STOCK MARKET DATA MINING TECHNIQUES AND STOCK MARKET Mr. Rahul Thakkar, Lecturer and HOD, Naran Lala College of Professional & Applied Sciences, Navsari ABSTRACT Without trading in a stock market we can t understand

More information

Name. Description. Rationale
