Using Clustering Techniques to Analyze Fraudulent Behavior Changes in Online Auctions
- Aileen Eaton
- 10 years ago
2010 International Conference on Networking and Information Technology

Using Clustering Techniques to Analyze Fraudulent Behavior Changes in Online Auctions

Wen-Hsi Chang, TamKang University, Graduate Institute of Management Sciences, Taipei, Taiwan
Jau-Shien Chang, TamKang University, Department of Information Management, Taipei, Taiwan

Abstract: Schemed fraudsters in online auctions often change their behavior as circumstances change, using the change as camouflage for malicious actions. For instance, fake transaction records interwoven with real trades are hard to distinguish from legitimate transaction histories. Different ways of changing behavior give rise to different types of swindling tricks. Recognizing the type of fraudulent behavior change in advance therefore helps traders choose appropriate trading partners and avoid fraudsters. To distinguish the types of behavior change among fraudsters, clustering techniques such as X-Means were applied to group fraudsters by their characteristics. Afterwards, C4.5 decision trees were employed to induce rules for the labeled clusters. In this study, real transaction data of 236 proven fraudsters was collected from Yahoo! Taiwan for testing. The experimental results demonstrate that the fraudsters fall into 4 natural groups and that the vast majority of them, 93% on average, follow certain default models to develop a scam. These findings also make early detection of online auction fraud possible.

Keywords: clustering; decision trees; e-commerce; fraud detection

I. INTRODUCTION

According to statistics from the Internet Crime Complaint Center in recent years, online auction fraud has accounted for a large percentage of complaints [1]. Each schemed fraud is prepared with deliberate premeditation, not by coincidence. The preparation work for a fraud is a composite of behavior intended to attract innocent buyers, such as raising a reputation score.
To guard against online auction fraud, recognizing the category of a fraud helps in choosing an appropriate countermeasure against each kind of fraudster. Openly available transaction histories were originally meant to help traders evaluate the credibility of a trading partner. However, schemed fraudsters exploit them as a swindling tool, raising their reputation in order to deceive traders. Because schemed fraudsters are good at disguising malicious behavior and interleaving regular trades in their transaction histories, online auction fraud has become one of the most serious threats among Internet frauds [2]. For instance, they usually manipulate feedback scores with various tricks to fabricate a reputation. Although their schemes for raising reputation scores can be quite sophisticated, they inevitably leave traces of the fraud in their transaction histories, such as the trend and magnitude of irregular behavior changes. This kind of abnormal behavior is reflected in the corresponding reputation management system. Unfortunately, without assistance from other information resources, most traders can usually identify only part of the fraudulent behavior.

One of the most popular solutions is a fraud detection system that uses known behavior patterns to inspect a suspicious account. Such a system can detect only those fraudsters whose behavioral features exactly match existing patterns. However, fraudsters often change their tricks dynamically according to current circumstances. As a consequence, even a minor modification of a trick can affect the accuracy of a fraud detection mechanism. Although behavior changes make fraud detection harder, fraudulent behavior still follows certain models or types in most cases of fraud. It is impossible to build a detection mechanism that has zero misclassification when a new trick emerges.
Therefore, we take the types of tricks used to commit fraud into account and use clustering techniques to categorize fraudulent behavior into natural groups by their characteristics. Afterwards, to identify minor modifications of a trick, similarities to existing fraudulent patterns are calculated to enhance the effectiveness of fraud detection. In this study, we employ the X-means algorithm, an extension of the K-means clustering technique, to categorize 236 proven fraudsters on Yahoo! Taiwan into 4 types. In addition, the C4.5 algorithm was applied to explain the rules of each labeled type of fraudulent behavior. The rest of this article is organized as follows: Section 2 covers related work on types of anomalies, measuring attributes in online auctions, and the clustering techniques we applied. Section 3 discusses the behavior changes we observed. Section 4 presents the natural clusters of fraudsters. The experimental results are presented in Section 5. Finally, conclusions and future research directions are given in the last section.

II. RELATED WORK

In the real world, auction fraudsters are very few among the vast majority of legitimate accounts. That is, a fraudster is an anomaly in online auctions. Online auction fraudster detection therefore refers to the problem of finding fraudulent behavior patterns in transaction histories. In other words, online auction fraud detection is one application of anomaly detection. The capability of identifying anomalies determines the effectiveness of a fraud detection system. For instance, Rubin proposes a new
reputation system to help buyers identify anomalies in online auctions, such as sellers whose auctions seem price-inflated [3].

A. Types of Anomalies

The key aspects of an anomaly detection technique include the nature of the data, the availability of labeled data, and the type of anomalies to be detected. Anomalies can be categorized into the following 3 types [4]:
1) Point Anomalies
2) Contextual Anomalies
3) Collective Anomalies

A credit card fraud defined only by the amount spent is a classic point anomaly: if a transaction is higher than the normal range of expenditure for that person, it could be a fraud. Similar situations often occur in online auctions, where traders usually assess their trading partners only by accumulated negative ratings. According to our observations, a schemed fraud could be a contextual anomaly. A contextual anomaly is an instance that is anomalous in a specific context, but not otherwise. The context of an instance is part of how a fraud is formulated. For instance, the more positive ratings a trading participant has obtained, the higher his reputation. However, the context of a fraud is a high density of obtaining positive ratings, which implies reputation inflation. Therefore, each case of fraud should be defined by contextual attributes and behavior attributes. A fraud can be committed not only by a single fraudster but also by a group of accounts, such as an accomplice syndicate. However, not all members of the syndicate commit the fraud, and most of them look like legitimate accounts individually. A collection of related instances that is anomalous with respect to the entire data set is a collective anomaly. In general, all three types of anomalies can appear in online auctions.

B. Measuring Attributes

A set of appropriate measuring attributes can depict the features of a fraudster precisely for building behavior models.
To observe the behavior changes of a fraudster, the first step is to determine which measuring attributes are used to describe fraudulent behavior. Chau and Faloutsos defined 17 price-oriented attributes for detecting fraudsters in online auctions [5]: the median price of items sold, the median price of items bought, the standard deviation of the prices of items sold, and the standard deviation of the prices of items bought, each within the first 15, first 30, last 30, and last 15 days of the transaction history, plus the ratio of the number of items bought to the number of all transactions. These 17 measuring attributes can depict point anomalies and some contextual anomalies with statistical values. To enhance the capability of depicting contextual anomalies, Chang and Chang defined another 7 measuring attributes as follows [6]:
1) Density of obtaining positive ratings
2) Density of obtaining negative ratings
3) Density of obtaining positive ratings after closing bid
4) Given being a seller, the ratio of positive ratings from other sellers to all positive ratings
5) Time difference from the last negative rating to the current time
6) Ratio of positive ratings to total feedback count
7) Ratio of negative ratings to total feedback

The combination of the two sets of measuring attributes above, which includes behavior attributes, statistical attributes, and particular values, proved effective for online auction fraudster detection in previous experiments.

C. Clustering Techniques

Clustering is a simple and straightforward technique for partitioning instances into disjoint groups [9]. K-means is one of the classic clustering techniques and uses an iterative distance-based algorithm. The number of clusters must be specified in advance through the parameter k. The algorithm then randomly chooses k points as cluster centers. Subsequently, each instance is assigned to its closest center according to the Euclidean distance function, as in a nearest-neighbor search.
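The initialization and assignment steps just described can be sketched in Python as follows. This is a minimal illustration only, not the Weka implementation used in the study; the function names `euclidean`, `init_centers`, and `assign`, the fixed random seed, and the toy points are all assumptions introduced here.

```python
import math
import random

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def init_centers(instances, k, seed=0):
    """Pick k distinct instances at random as the initial cluster centers."""
    rng = random.Random(seed)  # seed is an assumption, for reproducibility
    return rng.sample(instances, k)

def assign(instances, centers):
    """Return, for each instance, the index of its closest center."""
    return [
        min(range(len(centers)), key=lambda j: euclidean(p, centers[j]))
        for p in instances
    ]

# Toy data (hypothetical): two obvious groups in a 2-D attribute space.
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9)]
# Fixed centers are used here so that the assignment is reproducible.
centers = [(0.0, 0.0), (5.0, 5.0)]
labels = assign(points, centers)
```

In a full K-means run the assignment would alternate with recomputing each center until the assignments stop changing.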
Next, the mean of the instances in each cluster is calculated as the new center of that cluster. That is, each iteration produces a set of cluster centers, and all instances must be examined and assigned to the nearest center. The iterations continue until the same points are assigned to each cluster in consecutive rounds, at which point the cluster centers have stabilized and will remain the same [7]. In the following experiments, we instead apply X-Means, an extension of the K-means clustering technique, as implemented in Weka 3.7.0, to analyze the behavior changes of online auction fraudsters, who are divided into natural groups by characteristics rather than into predefined classes. X-Means improves the search over the space of cluster locations and the number of clusters by optimizing the Bayesian Information Criterion (BIC) measure. It attempts to split each center within its region, and decides whether to keep the split by comparing the BIC value of the center with that of its two children [8].

III. BEHAVIOR OF FRAUDSTERS

A. Earlier-Phased Behavior Simulation

A premeditated scam is a sophisticated composite of behavior changes and different tricks. According to our observations, there is always a break point between the preparation work for a fraud and its activation. In general, the value of each measuring attribute reflects only the current situation or behavior status. It is therefore difficult to identify with the naked eye a fraudster who fabricates a feedback score as camouflage, especially before the fraud is activated. To simulate the preparation work for a fraud, we extract part of the transaction history so that it is easier to monitor. To look back at the earlier stages of a fraudster, cutting off the last part of the transaction history is a simple and straightforward approach to simulation. Thus, a phased partitioning method
was developed for simulating behavior in the earlier phases. The transaction history of a proven fraudster whose account was suspended or voided is treated as a complete lifespan. Hence, we use a percentage of the transaction history to indicate the earlier phases of preparation work. Both the count of ratings and the duration of membership at the site could represent the lifespan of a fraudster. In addition, a transaction history consists mainly of information about positive and negative ratings. Hence, Chang and Chang proposed the following method for partitioning transaction histories [6]: the length of a transaction history equals the count of accumulated ratings. Thus, given an account u and its transaction history $TH(u) = \{tr_1, tr_2, \ldots, tr_n\}$, the r% lifespan of u is denoted $TH(u, r\%) = \{tr_1, tr_2, \ldots, tr_d\}$, where $d = \lceil n \times r\% \rceil$. The r% of a transaction history indicates the earlier r% stage of its lifespan. For example, if n = 60, then TH(u, 80%) is $\{tr_1, tr_2, \ldots, tr_{48}\}$ (see Fig. 1).

[Figure 1. Phased partition of transaction history. A timeline starting at Day 1 marks each positive rating and each negative rating obtained; Phase 90% covers the first 54 = 60 × 90% ratings and Phase 80% covers the first 48 = 60 × 80% ratings.]

B. Difficulties of Fraudulent Behavior Detection

Online auction fraud detection faces several challenges and difficulties:
1) The boundary between normal and anomalous behavior is often not precise.
2) In online auctions, fraudulent behavior keeps evolving, and stable patterns might not sufficiently depict fraudsters.
3) Sometimes a small deviation from legitimate accounts might be an anomaly, whereas some fluctuations in trading behavior might be considered normal.
4) The behavior of a normal account containing noisy data can be similar to actual anomalies, which makes the two very difficult to distinguish.
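Returning to the phased partition of Section III.A, the definition of TH(u, r%) can be sketched in Python as below. This is a minimal sketch under stated assumptions: the function name `phased_history`, the representation of a history as a chronologically ordered list, and the restriction of the percentage to integers are choices made here for illustration, not details given in the paper.

```python
def phased_history(history, percent):
    """Return the earliest `percent`% of a transaction history.

    `history` is a list of rating records in chronological order and
    `percent` is an integer such as 80 for Phase 80%.
    The cut-off is d = ceil(n * percent / 100), computed with integer
    arithmetic to avoid floating-point rounding surprises.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    n = len(history)
    d = -(-n * percent // 100)  # integer ceiling of n * percent / 100
    return history[:d]

# Hypothetical history of 60 ratings, labeled tr1 .. tr60.
history = [f"tr{i}" for i in range(1, 61)]
phase_80 = phased_history(history, 80)  # tr1 .. tr48
phase_90 = phased_history(history, 90)  # tr1 .. tr54
```

With n = 60 this reproduces the worked example above: Phase 80% keeps the first 48 ratings and Phase 90% the first 54.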
To address the problems above, we adopt a composite of measuring attributes that includes statistical values, particular values, and contextual values to depict fraudulent features from different aspects. We use statistical values as measuring attributes because they help control for small deviations. In particular, the following experiments emphasize the description of contextual situations to compensate for the ambiguity between normal and abnormal behavior; some particular values were therefore introduced to indicate certain contextual situations. To overcome the difficulties above, the method of partitioning transaction histories is integrated with the measuring attributes described earlier to emulate the preparation status of a fraudster, that is, the latency period.

IV. FRAUDSTER CLUSTERS

The result of clustering can be expressed in different ways [7] (see Fig. 2). The presentation of the clusters should be dictated by the nature of the mechanism that is thought to underlie the particular situation. Thus, fraudulent behavior may be categorized into exclusive or overlapping clusters. Moreover, the clusters may form a hierarchical or partially hierarchical structure, for instance with a crude division of instances at the top level. Alternatively, the clusters may be probabilistic, whereby an instance belongs to each group with a certain probability.

[Figure 2. Different presentations of clusters: (A) Exclusive clusters; (B) Overlapping clusters; (C) Hierarchical clusters.]

Behavior fluctuation between regularity and irregularity is a common trick of schemed fraudsters. In practice, the features in different phases of a latency period could therefore be assigned to different clusters. The following experiments aim to discover the types of fraudulent behavior changes via clustering.

V. EXPERIMENTAL RESULTS AND ANALYSES

We collected the real transaction histories of 236 proven fraudsters from the blacklist of Yahoo! Taiwan. The transaction history of each fraudster was partitioned into 5 phases based on the count of accumulated ratings: phase 100%, 95%, 90%, 85%, and 80%, as Fig. 1 illustrates. Each phased profile contains 24 numeric values given by the proposed composite of measuring attributes. As a result, the data set consists of 1,180 phased profiles for testing. The experimental results indicate that most fraudsters followed specific formulations when changing behavior.

A. Results of Clustering by Mixed Phased Profiles

Table 1 shows the experimental results of applying the X-Means algorithm, in which the 1,180 fraudulent profiles are categorized into 4 clusters. The 4 clusters reflect the natural characteristics of online auction fraudsters and are labeled type 0, type 1, type 2, and type 3, respectively. Type 2 is the most common type of fraud, accounting for 39% of online auction fraudsters. Type 0 accounts for 31%, type 1 for 21%, and only 9% of fraudsters belong to type 3. The remaining details are shown in Table 1.

TABLE I. RESULTS OF CLUSTERING
              Type 0   Type 1   Type 2   Type 3
Phase 100%
Phase 95%
Phase 90%
Phase 85%
Phase 80%
Total
*Percentage   31%      21%      39%      9%
* The percentage of profiles accounted for by each individual type of fraudulent behavior

According to the statistics in Table 2, almost all phased profiles of an account are assigned to the same cluster. Continuously testing a suspect before he is suspended would therefore help identify which type of fraudster he belongs to. 96% of Type 0 fraudsters are identified as the same type regardless of their current phase. The other types show similar results: 92% of Type 1, 96% of Type 2, and 87% of Type 3 are identified as the same type in all 5 phases. On average, 93% of fraudsters follow the same model to develop their schemes during their entire lifespan. The results suggest that the behavior of most fraudsters could be detected early.

TABLE II. THE COUNTS OF BEING IDENTIFIED BY PHASED PROFILES
           Count in all 5 phases   *Percentage
Type 0     71                      96%
Type 1     48                      92%
Type 2                             96%
Type 3                             87%
Average                            93%
* The percentage of fraudsters of each type that are categorized into the same type in all 5 phases

According to the experimental results, most fraudsters keep following one type of model. Fig. 3 shows that general fraudulent behavior develops linearly with respect to phase progress.

[Figure 3. General fraudulent behavior development. Horizontal axis: phase (80%, 85%, 90%, 95%, 100%); vertical axis: features in phase r%, F(r%); one line each for general Type 0, Type 1, Type 2, and Type 3 behavior.]

The behavior of some cunning fraudsters may not remain stable and develop according to the same model as expected. In addition, even a regular legitimate account will sometimes change its purchase habits abruptly under certain circumstances.

B. Early Fraud Detection

Since the experimental results above indicate that the X-Means clustering algorithm can identify a fraudster in the earlier phases, before the account is suspended (Phase 100%), early online fraud detection becomes possible. To discover what distinguishes the types, C4.5 decision trees were employed with 10-fold cross-validation to induce the following rules:

DensityOfNeg <=
|   RatioOfBuyingRate <=
|   |   MdSellingFirst15 <= 11150
|   |   |   DensityOfNeg <= : Type1
|   |   |   DensityOfNeg >
|   |   |   |   MdSellingFirst15 <= 2570: Type1
|   |   |   |   MdSellingFirst15 > 2570: Type0
|   |   MdSellingFirst15 > 11150: Type0
|   RatioOfBuyingRate >
|   |   RatioOfBuyingRate <= 0.375
|   |   |   MdBuyingFirst15 <= 900: Type1
|   |   |   MdBuyingFirst15 > 900: Type3
|   |   RatioOfBuyingRate > 0.375: Type3
DensityOfNeg >
|   RatioOfBuyingRate <=
|   |   StdBuyingFirst15 <= : Type0
|   |   StdBuyingFirst15 >
|   |   |   MdBuyingFirst15 <= 305: Typ
|   |   |   MdBuyingFirst15 > 305: Type0
|   RatioOfBuyingRate >
|   |   MdSellingFirst30 <= 9300
|   |   |   StdSellingFirst15 <=
|   |   |   |   StdBuyingFirst30 <=
|   |   |   |   |   StdSellingLast15 <=
|   |   |   |   |   |   StdBuyingLast30 <= : Typ
|   |   |   |   |   |   StdBuyingLast30 >
|   |   |   |   |   |   |   DensityOfNeg <= : Type3
|   |   |   |   |   |   |   DensityOfNeg > : Typ
|   |   |   |   |   StdSellingLast15 >
|   |   |   |   |   |   MdBuyingFirst15 <= 450: Type0
|   |   |   |   |   |   MdBuyingFirst15 > 450: Typ
|   |   |   |   StdBuyingFirst30 >
|   |   |   |   |   MdBuyingFirst15 <= 3050: Type3
|   |   |   |   |   MdBuyingFirst15 > 3050: Typ
|   |   |   StdSellingFirst15 >
|   |   |   |   MdSellingLast15 <= 11400: Type3
|   |   |   |   MdSellingLast15 > 11400: Typ
|   |   MdSellingFirst30 > 9300: Type0

Examining the rules above, we find that most Type 2 and Type 3 fraudsters have a higher density of obtaining negative ratings, and the prices of the commodities they purchased are lower on average. They might raise their credit by purchasing low-priced items. In contrast, Type 0 and Type 1 fraudsters have a lower density of obtaining negative ratings and purchase higher-priced goods. Type 0 is the traditional premeditated fraudster, who obtains a very high density of positive ratings to raise his credit score in a very short period.
Type 3 could be a very cunning kind of schemed fraudster who tends to flip behavior to attract targets. All 1,180 profiles were labeled with the results of clustering; the average recall rate is 98.6% (see Table 3).

TABLE III. OUTCOME OF APPLYING C4.5 DECISION TREES
TP Rate   FP Rate   Precision   Recall   F-Measure   Class
                                                     Type 0
                                                     Type 1
                                                     Type 2
                                                     Type 3
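The per-class measures named in the column headings of Table III follow standard definitions and can be computed from a confusion matrix, as in the sketch below. The matrix values are made up for illustration and are not results from the study; the function name `per_class_metrics` is likewise an assumption introduced here.

```python
def per_class_metrics(confusion):
    """Compute TP rate (recall), FP rate, precision and F-measure per class.

    `confusion[i][j]` is the number of instances of actual class i that
    were predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    metrics = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # missed instances of c
        fp = sum(confusion[r][c] for r in range(n)) - tp  # wrongly labeled as c
        tn = total - tp - fn - fp
        recall = tp / (tp + fn) if tp + fn else 0.0
        fp_rate = fp / (fp + tn) if fp + tn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        f_measure = (2 * precision * recall / (precision + recall)
                     if precision + recall else 0.0)
        metrics.append({"tp_rate": recall, "fp_rate": fp_rate,
                        "precision": precision, "recall": recall,
                        "f_measure": f_measure})
    return metrics

# Hypothetical 2-class confusion matrix (NOT data from the study).
example = [[8, 2],
           [1, 9]]
result = per_class_metrics(example)
```

An averaged recall such as the one quoted above would then be a mean over the per-class recall values; whether that mean is weighted by class size is not stated in the text.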
Table 4 shows the prediction errors that occurred in the previous C4.5 experiment. The results demonstrate that Type 0 and Type 3 are exclusive, but Type 0 might be misidentified as Type 1 or Type 2. Type 1 could be misidentified only as Type 3, not as the other types. Type 2 and Type 1 are exclusive too. It is more difficult to identify Type fraudster by clustering.

TABLE IV. PREDICTION ERRORS
            Type 0   Type 1   Type 2   Type 3
* Type 0                      X
* Type 1    X                          X
* Type 2    X                          X
* Type 3             X        X
* Predicted class; X denotes an incorrectly predicted class

C. Changes of Fraudulent Behavior

Referring to Table 5, 7 of the 236 fraudsters switch their behavior between different types during their lifespan. These 7 instances illustrate the phenomenon of overlapping clusters described in Section 4. Table 5 also explains why changes in fraudulent behavior affect the accuracy of clustering.

TABLE V. FRAUDULENT BEHAVIOR SWITCHING PATTERNS
Fraudster
Phase 100%
Phase 95%
Phase 90%
Phase 85%
Phase 80%

We list the results of clustering for detailed analysis. There are 3 kinds of behavior switching patterns between types: Type 0 and Type 1, Type 1 and Type 2, and Type 2 and Type 3. Type 0 and Type 3, by contrast, are mutually exclusive. The patterns are as follows:
1) Type 0 and Type 1: Fraudster A1, A2, A3
2) Type 1 and Type 2: Fraudster A4
3) Type 2 and Type 3: Fraudster A5, A6, A7

[Figure 4. Two examples of fraudulent behavior switching. Horizontal axis: phase (80%, 85%, 90%, 95%, 100%); vertical axis: behavior features in phase r%; Account A1 is plotted against Type 0 and Type 1, and Account A7 against Type 2 and Type 3.]

Fig. 4 shows that accounts A1 and A7 did not follow the default models linearly with respect to the corresponding phases. Fraudster A1 develops his scheme in phase 80%-85% as Type 0, and then changes his behavior to Type 1 when he enters phase 90%. Account A7 originally develops his scheme as Type 2, but turns it into Type 3 at phase 90%.

VI. CONCLUSIONS AND FUTURE WORK

Changes in fraudulent behavior have complicated the construction of fraudulent behavior models. The experimental results show that applying clustering techniques to categorize fraudulent behavior into natural groups helps discover fraudsters as early as possible. Regardless of which tricks fraudsters apply, the vast majority of them still follow certain models to develop scams. In other words, this finding implies that fraudsters could be identified in their earlier phases, even in phase 80%. The similarity metrics of clustering techniques are similar to those of instance-based learning classification methods. Clustering groups different behavior by processing all instances together, whereas an instance-based learning algorithm evaluates test instances individually. Therefore, the results of clustering could be used to label outcome classes, enhancing the functionality of a supervised classification method. As future research, we therefore plan to apply different instance-based learning algorithms to identify fraudulent behavior. In addition, hierarchical cluster analyses could be conducted to examine fraudulent behavior changes in more detail for further applications.

REFERENCES

[1] National White Collar Crime Center and the Federal Bureau of Investigation, "2009 Internet Crime Report," Mar. 2010, pp. 1-28.
[2] B. Gavish and C. L. Tucci, "Reducing Internet Auction Fraud," Commun. ACM, vol. 51, no. 5, May 2008.
[3] S. Rubin, M. Christodorescu, V. Ganapathy, J. T. Giffin, L. Kruger, H. Wang, and N. Kidd, "An auctioning reputation system based on anomaly detection," in Proceedings of the 12th ACM Conference on Computer and Communications Security (CCS 05), Alexandria, VA, USA, Nov. 7-11, 2005.
[4] V. Chandola, A. Banerjee, and V. Kumar, "Anomaly detection: A survey," ACM Comput. Surv., vol. 41, no. 3, Jul. 2009, pp. 1-58.
[5] D. H. Chau and C. Faloutsos, "Fraud Detection in Electronic Auction," in Proceedings of European Web Mining Forum (EWMF 2005) at ECML/PKDD, Oct. 3-7, 2005.
[6] J. S. Chang and W. H. Chang, "An Early Fraud Detection Mechanism for Online Auctions Based on Phased Modeling," in Proceedings of the 2009 International Workshop on Mobile Systems, E-commerce and Agent Technology (MSEAT 2009), Taipei, Taiwan, Dec. 3-5, 2009.
[7] I. H. Witten and E. Frank, "Data Mining: Practical Machine Learning Tools and Techniques," San Francisco: Morgan Kaufmann, 2005.
[8] D. Pelleg and A. W. Moore, "X-means: Extending K-means with Efficient Estimation of the Number of Clusters," in Proceedings of the Seventeenth International Conference on Machine Learning, June 29 - July 2, 2000, P. Langley, Ed. San Francisco, CA: Morgan Kaufmann, 2000.
International Journal of Advanced Computer Technology (IJACT) ISSN:2319-7900 PRIVACY PRESERVING DATA MINING IN HEALTH CARE APPLICATIONS
PRIVACY PRESERVING DATA MINING IN HEALTH CARE APPLICATIONS First A. Dr. D. Aruna Kumari, Ph.d, ; Second B. Ch.Mounika, Student, Department Of ECM, K L University, [email protected]; Third C.
A Novel Classification Approach for C2C E-Commerce Fraud Detection
A Novel Classification Approach for C2C E-Commerce Fraud Detection *1 Haitao Xiong, 2 Yufeng Ren, 2 Pan Jia *1 School of Computer and Information Engineering, Beijing Technology and Business University,
Reference Books. Data Mining. Supervised vs. Unsupervised Learning. Classification: Definition. Classification k-nearest neighbors
Classification k-nearest neighbors Data Mining Dr. Engin YILDIZTEPE Reference Books Han, J., Kamber, M., Pei, J., (2011). Data Mining: Concepts and Techniques. Third edition. San Francisco: Morgan Kaufmann
Insider Threat Detection Using Graph-Based Approaches
Cybersecurity Applications & Technology Conference For Homeland Security Insider Threat Detection Using Graph-Based Approaches William Eberle Tennessee Technological University [email protected] Lawrence
An Analysis of Missing Data Treatment Methods and Their Application to Health Care Dataset
An Analysis of Missing Data Treatment Methods and Their Application to Health Care Dataset Peng Liu 1, Elia El-Darzi 2, Lei Lei 1, Christos Vasilakis 2, Panagiotis Chountas 2, and Wei Huang
Bisecting K-Means for Clustering Web Log data
Bisecting K-Means for Clustering Web Log data Ruchika R. Patil Department of Computer Technology YCCE Nagpur, India Amreen Khan Department of Computer Technology YCCE Nagpur, India ABSTRACT Web usage mining
Spam detection with data mining method:
Spam detection with data mining method: Ensemble learning with multiple SVM based classifiers to optimize generalization ability of email spam classification Keywords: ensemble learning, SVM classifier,
Categorical Data Visualization and Clustering Using Subjective Factors
Categorical Data Visualization and Clustering Using Subjective Factors Chia-Hui Chang and Zhi-Kai Ding Department of Computer Science and Information Engineering, National Central University, Chung-Li,
ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING)
ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING) Gabriela Ochoa http://www.cs.stir.ac.uk/~goc/ OUTLINE Preliminaries Classification and Clustering Applications
Cluster Analysis: Basic Concepts and Algorithms
Cluster Analysis: Basic Concepts and Algorithms What does it mean clustering? Applications Types of clustering K-means Intuition Algorithm Choosing initial centroids Bisecting K-means Post-processing Strengths
A Survey on Outlier Detection Techniques for Credit Card Fraud Detection
IOSR Journal of Computer Engineering (IOSR-JCE) e-ISSN: 2278-0661, p-ISSN: 2278-8727, Volume 16, Issue 2, Ver. VI (Mar-Apr. 2014), PP 44-48 A Survey on Outlier Detection Techniques for Credit Card Fraud
A Secure Online Reputation Defense System from Unfair Ratings using Anomaly Detections
A Secure Online Reputation Defense System from Unfair Ratings using Anomaly Detections Asha baby PG Scholar,Department of CSE A. Kumaresan Professor, Department of CSE K. Vijayakumar Professor, Department
How To Cluster
Data Clustering Dec 2nd, 2013 Kyrylo Bessonov Talk outline Introduction to clustering Types of clustering Supervised Unsupervised Similarity measures Main clustering algorithms k-means Hierarchical Main
A Content based Spam Filtering Using Optical Back Propagation Technique
A Content based Spam Filtering Using Optical Back Propagation Technique Sarab M. Hameed 1, Noor Alhuda J. Mohammed 2 Department of Computer Science, College of Science, University of Baghdad - Iraq ABSTRACT
Machine Learning using MapReduce
Machine Learning using MapReduce What is Machine Learning Machine learning is a subfield of artificial intelligence concerned with techniques that allow computers to improve their outputs based on previous
International Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 3, May-June 2015
RESEARCH ARTICLE OPEN ACCESS Data Mining Technology for Efficient Network Security Management Ankit Naik [1], S.W. Ahmad [2] Student [1], Assistant Professor [2] Department of Computer Science and Engineering
Data Mining + Business Intelligence. Integration, Design and Implementation
Data Mining + Business Intelligence Integration, Design and Implementation ABOUT ME Vijay Kotu Data, Business, Technology, Statistics BUSINESS INTELLIGENCE - Result Making data accessible Wider distribution
Recognizing The Theft of Identity Using Data Mining
Recognizing The Theft of Identity Using Data Mining Aniruddha Kshirsagar 1, Lalit Dole 2 1,2 CSE Department, GHRCE, Nagpur, Maharashtra, India Abstract Identity fraud is the great matter of concern in
Use of Data Mining Techniques to Improve the Effectiveness of Sales and Marketing
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 4, April 2015,
Chapter 6. The stacking ensemble approach
This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc are also described
Adaptive Anomaly Detection for Network Security
International Journal of Computer and Internet Security. ISSN 0974-2247 Volume 5, Number 1 (2013), pp. 1-9 International Research Publication House http://www.irphouse.com Adaptive Anomaly Detection for
Healthcare Measurement Analysis Using Data mining Techniques
www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 03 Issue 07 July, 2014 Page No. 7058-7064 Healthcare Measurement Analysis Using Data mining Techniques 1 Dr.A.Shaik
A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier
A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier G.T. Prasanna Kumari Associate Professor, Dept of Computer Science and Engineering, Gokula Krishna College of Engg, Sullurpet-524121,
Site-Specific versus General Purpose Web Search Engines: A Comparative Evaluation
Panhellenic Conference on Informatics Site-Specific versus General Purpose Web Search Engines: A Comparative Evaluation G. Atsaros, D. Spinellis, P. Louridas Department of Management Science and Technology
A new Approach for Intrusion Detection in Computer Networks Using Data Mining Technique
A new Approach for Intrusion Detection in Computer Networks Using Data Mining Technique Aida Parbaleh 1, Dr. Heirsh Soltanpanah 2* 1 Department of Computer Engineering, Islamic Azad University, Sanandaj
Predicting Student Performance by Using Data Mining Methods for Classification
BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 13, No 1 Sofia 2013 Print ISSN: 1311-9702; Online ISSN: 1314-4081 DOI: 10.2478/cait-2013-0006 Predicting Student Performance
Data Mining Solutions for the Business Environment
Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania [email protected] Over
Detecting E-mail Spam Using Spam Word Associations
Detecting E-mail Spam Using Spam Word Associations N.S. Kumar 1, D.P. Rana 2, R.G.Mehta 3 Sardar Vallabhbhai National Institute of Technology, Surat, India 1 [email protected] 2 [email protected]
Efficient Data Recovery scheme in PTS-Based OFDM systems with MATRIX Formulation
Efficient Data Recovery scheme in PTS-Based OFDM systems with MATRIX Formulation Sunil Karthick.M PG Scholar Department of ECE Kongu Engineering College Perundurau-638052 Venkatachalam.S Assistant Professor
Web Forensic Evidence of SQL Injection Analysis
International Journal of Science and Engineering Vol. 5 No. 1 (2015): 157-162 Web Forensic Evidence of SQL Injection Analysis 針對 SQL Injection 攻擊鑑識之分析 Chinyang Henry Tseng 1 National Taipei University
An Overview of Knowledge Discovery Database and Data mining Techniques
An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,
Enhanced Boosted Trees Technique for Customer Churn Prediction Model
IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021, ISSN (p): 2278-8719 Vol. 04, Issue 03 (March. 2014), V5 PP 41-45 www.iosrjen.org Enhanced Boosted Trees Technique for Customer Churn Prediction
Customer Classification And Prediction Based On Data Mining Technique
Customer Classification And Prediction Based On Data Mining Technique Ms. Neethu Baby 1, Mrs. Priyanka L.T 2 1 M.E CSE, Sri Shakthi Institute of Engineering and Technology, Coimbatore 2 Assistant Professor
STATISTICA. Clustering Techniques. Case Study: Defining Clusters of Shopping Center Patrons. and
Clustering Techniques and STATISTICA Case Study: Defining Clusters of Shopping Center Patrons STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table
Maximizing Return and Minimizing Cost with the Decision Management Systems
KDD 2012: Beijing 18 th ACM SIGKDD Conference on Knowledge Discovery and Data Mining Rich Holada, Vice President, IBM SPSS Predictive Analytics Maximizing Return and Minimizing Cost with the Decision Management
A Review of Anomaly Detection Techniques in Network Intrusion Detection System
A Review of Anomaly Detection Techniques in Network Intrusion Detection System Dr. D.V.S.S. Subrahmanyam Professor, Dept. of CSE, Sreyas Institute of Engineering & Technology, Hyderabad, India ABSTRACT: In
E-commerce Transaction Anomaly Classification
E-commerce Transaction Anomaly Classification Minyong Lee [email protected] Seunghee Ham [email protected] Qiyi Jiang [email protected] I. INTRODUCTION Due to the increasing popularity of e-commerce
Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification
Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification Tina R. Patil, Mrs. S. S. Sherekar Sant Gadgebaba Amravati University, Amravati [email protected], [email protected]
How To Solve The Kd Cup 2010 Challenge
A Lightweight Solution to the Educational Data Mining Challenge Kun Liu Yan Xing Faculty of Automation Guangdong University of Technology Guangzhou, 510090, China [email protected] [email protected]
Standardization and Its Effects on K-Means Clustering Algorithm
Research Journal of Applied Sciences, Engineering and Technology 6(7): 399-3303, 03 ISSN: 040-7459; e-issn: 040-7467 Maxwell Scientific Organization, 03 Submitted: January 3, 03 Accepted: February 5, 03
Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining
Data Mining Cluster Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 8 Introduction to Data Mining by Tan, Steinbach, Kumar Tan, Steinbach, Kumar Introduction to Data Mining /8/ What is Cluster
Data Mining - Evaluation of Classifiers
Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010
Spam Detection on Twitter Using Traditional Classifiers M. McCord CSE Dept Lehigh University 19 Memorial Drive West Bethlehem, PA 18015, USA
Spam Detection on Twitter Using Traditional Classifiers M. McCord CSE Dept Lehigh University 19 Memorial Drive West Bethlehem, PA 18015, USA [email protected] M. Chuah CSE Dept Lehigh University 19 Memorial
EFFICIENCY OF DECISION TREES IN PREDICTING STUDENT'S ACADEMIC PERFORMANCE
EFFICIENCY OF DECISION TREES IN PREDICTING STUDENT'S ACADEMIC PERFORMANCE S. Anupama Kumar 1 and Dr. Vijayalakshmi M.N 2 1 Research Scholar, PRIST University, 1 Assistant Professor, Dept of M.C.A. 2 Associate
Overview. Evaluation Connectionist and Statistical Language Processing. Test and Validation Set. Training and Test Set
Overview Evaluation Connectionist and Statistical Language Processing Frank Keller [email protected] Computerlinguistik Universität des Saarlandes training set, validation set, test set holdout, stratification
APPLICATION OF DATA MINING TECHNIQUES FOR BUILDING SIMULATION PERFORMANCE PREDICTION ANALYSIS. email [email protected]
Eighth International IBPSA Conference Eindhoven, Netherlands August -4, 2003 APPLICATION OF DATA MINING TECHNIQUES FOR BUILDING SIMULATION PERFORMANCE PREDICTION Christoph Morbitzer, Paul Strachan 2 and
International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014
RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer
Classification of Learners Using Linear Regression
Proceedings of the Federated Conference on Computer Science and Information Systems pp. 717 721 ISBN 978-83-60810-22-4 Classification of Learners Using Linear Regression Marian Cristian Mihăescu Software
The Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process "When I use a word," Humpty Dumpty said, in rather a scornful tone, "it means just what I choose it to mean, neither more nor less." Lewis Carroll [87, p. 214] In
Business Process Mining for Internal Fraud Risk Reduction: Results of a Case Study
Business Process Mining for Internal Fraud Risk Reduction: Results of a Case Study Mieke Jans, Nadine Lybaert, and Koen Vanhoof Hasselt University, Agoralaan Building D, 3590 Diepenbeek, Belgium http://www.uhasselt.be/
The Optimality of Naive Bayes
The Optimality of Naive Bayes Harry Zhang Faculty of Computer Science University of New Brunswick Fredericton, New Brunswick, Canada email: hzhang@unbca E3B 5A3 Abstract Naive Bayes is one of the most
In Proceedings of the Eleventh Conference on Biocybernetics and Biomedical Engineering, pages 842-846, Warsaw, Poland, December 2-4, 1999
In Proceedings of the Eleventh Conference on Biocybernetics and Biomedical Engineering, pages 842-846, Warsaw, Poland, December 2-4, 1999 A Bayesian Network Model for Diagnosis of Liver Disorders Agnieszka
COURSE RECOMMENDER SYSTEM IN E-LEARNING
International Journal of Computer Science and Communication Vol. 3, No. 1, January-June 2012, pp. 159-164 COURSE RECOMMENDER SYSTEM IN E-LEARNING Sunita B Aher 1, Lobo L.M.R.J. 2 1 M.E. (CSE)-II, Walchand
Implementation of hybrid software architecture for Artificial Intelligence System
IJCSNS International Journal of Computer Science and Network Security, VOL.7 No.1, January 2007 35 Implementation of hybrid software architecture for Artificial Intelligence System B.Vinayagasundaram and
Scheduling Allowance Adaptability in Load Balancing technique for Distributed Systems
Scheduling Allowance Adaptability in Load Balancing technique for Distributed Systems G.Rajina #1, P.Nagaraju #2 #1 M.Tech, Computer Science Engineering, TallaPadmavathi Engineering College, Warangal,
Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland
Data Mining and Knowledge Discovery in Databases (KDD) State of the Art Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland 1 Conference overview 1. Overview of KDD and data mining 2. Data
Data quality in Accounting Information Systems
Data quality in Accounting Information Systems Comparing Several Data Mining Techniques Erjon Zoto Department of Statistics and Applied Informatics Faculty of Economy, University of Tirana Tirana, Albania
Knowledge Discovery and Data Mining
Knowledge Discovery and Data Mining Unit # 6 Sajjad Haider Fall 2014 1 Evaluating the Accuracy of a Classifier Holdout, random subsampling, crossvalidation, and the bootstrap are common techniques for
Financial Trading System using Combination of Textual and Numerical Data
Financial Trading System using Combination of Textual and Numerical Data Shital N. Dange Computer Science Department, Walchand Institute of Rajesh V. Argiddi Assistant Prof. Computer Science Department,
Unsupervised learning: Clustering
Unsupervised learning: Clustering Salissou Moutari Centre for Statistical Science and Operational Research CenSSOR 17 th September 2013 Unsupervised learning: Clustering 1/52 Outline 1 Introduction What
Predicting borrowers chance of defaulting on credit loans
Predicting borrowers chance of defaulting on credit loans Junjie Liang ([email protected]) Abstract Credit score prediction is of great interests to banks as the outcome of the prediction algorithm
A QoS-Aware Web Service Selection Based on Clustering
International Journal of Scientific and Research Publications, Volume 4, Issue 2, February 2014 1 A QoS-Aware Web Service Selection Based on Clustering R.Karthiban PG scholar, Computer Science and Engineering,
Local outlier detection in data forensics: data mining approach to flag unusual schools
Local outlier detection in data forensics: data mining approach to flag unusual schools Mayuko Simon Data Recognition Corporation Paper presented at the 2012 Conference on Statistical Detection of Potential
